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(57) The present invention is directed to serve ligand 
screening apparatuses, ligand screening methods, pro- 
grams and recording medium for studying the binding 
analysis between a receptor including an induced-fit type 
receptor and a ligand. First, analysis and calculation of 
normal mode are conducted, and then fluctuation of di- 
hedral angle of main chain in a steady state is calculated. 
Then by carrying out molecular dynamiccalculation while 
imposing constraint on each atom based on the fluctua- 
tion, dynamic structure of the receptor is predicted more 
accurately. By using the dynamic structure obtained in 
the molecular dynamic calculation and an interaction 
function, receptor/ligand binding which is also applicable 
to an induced-fit type receptor is predicted with high ac- 
curacy. 
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Description 

TECHNICAL FIELD 

5 [0001] The present invention relates to ligand screening apparatuses, ligand screening methods, programs and re- 
cording medium using protein 3D structure coordinates, and more specifically to a ligand screening apparatus, a ligand 
screening method, a program and a recording medium for predicting a ligand that is considered as being involved in 
interaction, for a protein having known 3D structure coordinate. 

10 BACKGROUND ART 

[0002] Proteins required for maintenance of biological functions such as enzymes and receptors have properties called 
"substrate specificity", and such proteins are classified into "Lock&Key" type wherein an active site constantly remains 
unchanged to details of structure of substrate molecule, and "Induced-Fit" (induced-bonding) type wherein an active site 
is is in a random inactive state in the absence of a substrate, and the active site changes into an active state in the presence 
of a substrate for capturing the coming substrate. By the term induced-fit type, such a receptor is contemplated that 3D 
structure of a ligand binding site changes in binding with a ligand to allow intake of the ligand. 

[0003] As a computational chemical technique for screening for ligand molecules using 3D structure of protein, 3D 
compound database screening (Virtual Screening) as reported in AutoDocK ("Morris, G.M. Goodsell, D.S. Halliday, R.S. 

20 Huey, R. Hart, W.E. Belew, R.K. Olson, A.J. (1998) Automated docking using a Lamarckian genetic algorithm and an 
empirical biding free energy function. J. Comput. Chem. 19:1639-1662; Goodsell, D.S. Morris, G.M. Olson, A. J. (1996) 
Automated docking of flexible ligands: applications of AutoDock. J. Mol. Recognit, 9: 1 -5"), DOCK ("Ewing, T.J. I Makino, 
S. Skillman, A. G. Kuntz I. D. (2001) DOCK 4.0: search strategies for automated molecular docking of flexible molecule 
databases. J. Comput. Aided Mol. Des. 15: 41 1-28"), FlexX ("Rarey, M, Kramer, B, Lengauer, T, Klebe G. (1996) Afast 

25 flexible docking method using an incremental construction algorithm. J. Mol. Biol. 261: 470-89; Rarey, M. Wefing, S. 
Lengauer, T. (1996) Placement of medium-sized molecular fragments into active sites of proteins. J. Comput. Aided 
Mol. Des. 10: 41-54"), GOLD (" Jones, G. Willett, P. Glen, R. C. (1995) Molecular recognition of receptor sites using a 
genetic algorithm with a description of desolvation. J. Mol. Biol. 245:43-53; Jones, G. Willett, P. Glen, R.C, Leach, A.R. 
Taylor, R. (1997) Development and validation of a genetic algorithm for flexible docking. J. Mol. Biol. 267: 727-48"), 

30 ADAM&EVE ("Mizutani, M.Y. Itai, A. (2004) Efficient method for high-throughput virtual screening based on flexible 
docking: discovery of novel acetylcholinesterase inhibitors. J. Med. Chem. 47: 4818-4828") are known. These are also 
called "High-performance docking study" and enable a mass-scale compound library screening. However, ability of 
these techniques to predict binding conformation and binding energy is poor because rough approximation is used for 
evaluation. In addition, since they fail to consider computational expression parameters corresponding to "induced- 

35 binding" which is very important for binding between protein and ligand, and even if such consideration is made, it is to 
such an extent that random numbers are generated and side chains of receptor are moved, and accuracy of computation 
result is not sufficient. 

[0004] As a method for simulating "induced-binding" which is important for binding between protein and ligand, MD 
(molecular dynamic calculation), MM (molecular mechanical calculation), and MC (Monte Carlo method) are known. 

40 These methods provide relatively high accuracy, and enable prediction of binding conformation and binding energy. 
Here, the technique called "molecular dynamic method (MD)" calculates dynamic structure of a molecule by sequentially 
solving dynamic equation based on the classical dynamics for each atom constituting the molecule, and enables simu- 
lation of dynamic behavior of a protein with high accuracy. However, it is not necessarily useful means because it requires 
significant time for calculation and has difficulty in handling many molecules. Further, molecular dynamic calculation 

45 executed for a target protein according to such a conventional method results in a protein 3D structure whose coordinate 
is largely different from those analyzed by X-ray, NMRandthe like. Although such difference includes a physicochemical 
description of dynamic behavior of protein, it sometimes behaves contradictorily to an experimental result of dynamic 
behavior proved by NMR or the like, and hence it often fails to provide accurate simulation result. 
[0005] As described above, in respect of the conventional "in silico screening", since computational expression pa- 

50 rameters corresponding to "induced-binding" which is very important for binding between protein and ligand are not 
sufficiently considered, it does not deem that the accuracy of calculation result is adequate. 

[0006] On the other hand, in molecular simulation, it is possible to express and analyze the above induced-binding; 
however, significant time is required for obtaining an accurate result. Many results will be influenced by the initial structure 
coordinate. 

55 [0007] Inventors of the present invention examined the way of screening for a ligand that will bind to a target protein 
when 3D structure of a certain protein is given. As described above, some currently available receptor-ligand binding 
analyzing software takes flexibility of a ligand into account, but most of such software fails to consider flexibility of a 
receptor. Even though there is software that considers flexibility of a receptor, such consideration just moves a side 
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chain of the receptor by generation of random numbers, and most of software are dedicated to a Lock&Key type receptor. 
In such circumstances, we attempted to develop receptor-ligand binding analyzing software dedicated to an Induced- 
Fit type receptor. 

[0008] The problem to be solved by the present invention is to provide a ligand screening apparatus, a ligand screening 
5 method, a program and a recording medium capable of screening for a ligand that binds to a certain protein which is a 
particularly important key to development of agricultural chemicals and pharmaceuticals and the like, with significantly 
higher efficiency and accuracy than conventional methods. It is also an object to provide a ligand screening apparatus, 
a ligand screening method, a program and a recording medium which carry out various modifications of ligand molecule 
and modifications of protein such as receptor rapidly and effectively. It is also an object of the present invention to clarify 
10 a mode of interaction between ligand and protein and make the recognition mechanism of the interaction clear, thereby 
identifying a cause of disease, and promoting development of related drugs. 

DISCLOSURE OF INVENTION 

15 [0009] Inventors of the present invention diligently examined method of screening for a ligand that binds to a target 
protein when any 3D structure of such a protein is given, and finally established a ligand screening apparatus, a ligand 
screening method, a computer program and a recording medium as will be described later. 

[0010] Here used a procedure called "molecular dynamic method (MD)" which calculates dynamic structure of a 
molecule by sequentially solving dynamic equation based on the classical dynamics for each atom constituting the 

20 molecule. In otherwords, this procedure calculates dynamic behavior based on classic dynamics in each atom constituting 
a certain molecule. Therefore, if one can employ this procedure successfully, even when an induced-fit type receptor in 
a state that no ligand is captured is selected as an initial state, binding between the receptor and a ligand could be 
reconstructed. Since the MD calculation is based on the classic dynamics, it is necessary to impose certain degree of 
constraint to each atom. For this reason, in our developing procedure, normal mode vibration (hereinafter "normal mode") 

25 of a receptor is first analyzed to calculate fluctuation of dihedral angle of main chain of the receptor, and then MD 
calculation is conducted while imposing constraint on each atom based on the calculated fluctuation of dihedral angle 
of main chain. To be more specific, first, analysis and calculation of normal mode are conducted, and then fluctuation 
of dihedral angle of main chain in a steady state is calculated. Then by carrying out molecular dynamic calculation while 
imposing constraint on each atom based on the fluctuation, dynamic structure of the receptor is predicted more accurately. 

30 By using the dynamic structure obtained in the molecular dynamic calculation and an interaction function, receptor/ligand 
binding which is also applicable to an induced-fit type receptor is predicted with high accuracy. In brief, the present 
invention predicts receptor/ligand binding more realistically with high accuracy. Therefore, the present invention is very 
useful for designing pharmaceuticals and agricultural chemicals. 

[0011] To solve the objectives described above, a ligand screening apparatus according to a present invention is a 
35 ligand screening apparatus which screens for a ligand that binds to a protein when coordinate data of the protein of 
single chain or plural chains is given, the apparatus including: post-structural-change protein coordinate data selecting 
unit that effects structural change in consideration of dynamic behavior using induced-fit parameter reflecting induced 
fit on the coordinate data of protein and selects post-structural-change protein coordinate data; spatial point designating 
unit that designates a spatial point at which superposition with the ligand is to be conducted, from the post-structural- 
40 change protein coordinate data selected by the post-structural-change protein coordinate data selecting unit; interaction 
function calculating unit that calculates an interaction function when the protein and the ligand bind to each other using 
the spatial point designated by the spatial point designating unit and a ligand coordinate data of the ligand; and ligand 
evaluating unit that evaluates the ligand that binds to the protein based on the interaction function calculated by the 
interaction function calculating unit. 
45 [0012] The ligand screening apparatus according to another aspect of the present invention is the ligand screening 
apparatus, wherein the interaction function calculating unit calculates the interaction function using Score (i,j) shown in 
Formula 1. 
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(wherein is a distance between i-th spatial point and j-th spatial point in the target protein, dy 0 is an interatomic 
distance between i-th atom and j-th atom in the compound, a is a coefficient for making Sscore(iJ) the maximum value 
when the group of spatial points in the target protein and the compound completely overlap with each other. (3 is a 
coefficient for giving a threshold value by which it can be defined as "overlapping") 
5 [0013] The ligand screening apparatus according to another aspect of the present invention is the ligand screening 
apparatus, wherein the interaction function calculating unit further includes interaction function optimizing unit that carries 
out optimization so as to make the score of interaction function maximum. 

[0014] The ligand screening apparatus according to another aspect of the present invention is the ligand screening 
apparatus, wherein the interaction function calculating unit further includes: interaction energy optimizing unit that cal- 
10 culates interaction energy with the protein for the superposed ligand after optimization of the interaction function by the 
interaction function optimizing unit, and optimizes the interaction energy while finely adjusting conformation of ligand 3D 
structure data. 

[0015] The ligand screening apparatus according to another aspect of the present invention is the ligand screening 
apparatus, wherein the ligand evaluating unit further includes: reevaluating unit that executes the interaction function 
is calculating unit after largely changing conformation of ligand 3D structure data following optimization by the interaction 
energy optimizing unit, and reevaluates the ligand that binds to the protein based on the interaction function calculated 
by the interaction function calculating unit. 

[0016] The ligand screening apparatus according to another aspect of the present invention is the ligand screening 
apparatus, wherein in calculation of any one of the induced-fit parameter and the post-structural-change protein coor- 
20 dinate data or both, the post-structural-change protein coordinate data selecting unit calculates normal mode for the 
protein coordinate data, determines intensity of fluctuation of each amino acid, and conduct molecular dynamic calculation 
using the intensity of fluctuation as a constraint condition. 

[0017] The ligand screening apparatus according to another aspect of the present invention is the ligand screening 
apparatus, wherein the post-structural-change protein coordinate data selecting unit calculates a fluctuation value of 
25 dihedral angle of main chain according to normal mode calculation, and conducts molecular dynamic calculation while 
setting the fluctuation value as a coefficient of force K in the molecular dynamic calculation shown by Formula 2 or 
Formula 3. 



30 Erot = KTOt((j)-(t)0) 2 [Formula 2] 

(wherein Erot represents energy of dihedral angle of main chain atom in 3D structure of a protein. § represents dihedral 
angle of main chain atoms. <|>0 represents standard value of dihedral angle of main chain atoms. Here, when a value 
35 of Krot is large, <)> is constrained by (|>0.) 



EpoS = Kpos(r-rO) 2 [Formula 3] 

40 

(wherein Epos represents position energy of main chain atom in 3D structure of a protein, r represents coordinate of 
main chain atom. rO represents standard value of coordinate of main chain atom. Here, when a value of Kpos is large, 
r is constrained by rO.) 

[0018] The ligand screening apparatus according to another aspect of the present invention is the ligand screening 
45 apparatus, wherein the interaction function calculating unit uses the interaction function to which a dynamic property 
function representing dynamic property of protein is added as "elastic energy". 

[0019] The ligand screening apparatus according to another aspect of the present invention is the ligand screening 
apparatus, wherein the interaction function calculating unit adapts "U collision" as "elastic energy" which is a function 
shown by Formula 4 in consideration of local flexibility of protein. 

50 
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Ueomsion = X X<pO j) 
i=l j=l 

cp (i ? j) = K collision * (R collision (i, j) - R) 2 

[Formula 4] 

10 

(wherein M is number of atoms in an active site that prohibit collision, N is number of atoms of ligand. When interatomic 
distance R between an i-th atom of a main chain or a side chain with a little dynamic behavior in active site, and j-th 
atom in the ligand is not more than collision distance ,, Rcollision(i,j)", c|)(i,j) is calculated) 

[0020] The ligand screening apparatus according to another aspect of the present invention is the ligand screening 
15 apparatus, wherein the interaction function calculating unit uses the interaction function to which a normal mode analysis 
result or secondary structure determination result of the protein is added as a dynamic property function that represents 
dynamic property of protein. 

[0021] A ligand screening method according to the present invention is the ligand screening method which screens 
for a ligand that binds to a protein when coordinate data of the protein of single chain or plural chains is given, the method 

20 including: post-structural-change protein coordinate data selecting step that effects structural change in consideration 
of dynamic behavior using induced-fit parameter reflecting induced fit on the coordinate data of protein and selects post- 
structural-change protein coordinate data; spatial point designating step that designates a spatial point at which super- 
position with the ligand is to be conducted, from the post -structural-change protein coordinate data selected by the post- 
structural-change protein coordinate data selecting step; interaction function calculating step that calculates an interaction 

25 function when the protein and the ligand bind to each other using the spatial point designated by the spatial point 
designating step and a ligand coordinate data of the ligand; and ligand evaluating step that evaluates the ligand that 
binds to the protein based on the interaction function calculated by the interaction function calculating step. 
[0022] The ligand screening method according to another aspect of the present invention is the ligand screening 
method, wherein the interaction function calculating step calculates the interaction function using Score (i,j) shown in 

30 Formula 1. 
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40 



(wherein djj s is a distance between i-th spatial point and j-th spatial point in the target protein. djj c is an interatomic 
distance between i-th atom and j-th atom in the compound, a is a coefficient for making Sscore(iJ) the maximum value 
45 when the group of spatial points in the target protein and the compound completely overlap with each other, p is a 
coefficient for giving a threshold value by which it can be defined as "overlapping") 

[0023] The ligand screening method according to another aspect of the present invention is the ligand screening 
method, wherein the interaction function calculating step further includes interaction function optimizing step that carries 
out optimization so as to make the score of interaction function maximum. 
50 [0024] The ligand screening method according to another aspect of the present invention is the ligand screening 
method, wherein the interaction function calculating step further includes: interaction energy optimizing step that calcu- 
lates interaction energy with the protein for the superposed ligand after optimization of the interaction function by the 
interaction function optimizing step, and optimizes the interaction energy while finely adjusting conformation of ligand 
3D structure data. 

55 [0025] The ligand screening method according to another aspect of the present invention is the ligand screening 
method, wherein the ligand evaluating step further includes: reevaluating step that executes the interaction function 
calculating step after largely changing conformation of ligand 3D structure data following optimization by the interaction 
energy optimizing step, and reevaluates the ligand that binds to the protein based on the interaction function calculated 
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by the interaction function calculating step. 

[0026] The ligand screening method according to another aspect of the present invention is the ligand screening 
method, wherein in calculation of any one of the induced-fit parameter and the post-structural-change protein coordinate 
data or both, the post-structural-change protein coordinate data selecting step calculates normal mode for the protein 
5 coordinate data, determines intensity of fluctuation of each amino acid, and conduct molecular dynamic calculation using 
the intensity of fluctuation as a constraint condition. 

[0027] The ligand screening method according to another aspect of the present invention is the ligand screening 
method, wherein the post-structural-change protein coordinate data selecting step calculates a fluctuation value of 
dihedral angle of main chain according to normal mode calculation, and conducts molecular dynamic calculation while 
10 setting the fluctuation value as a coefficient of force K in the molecular dynamic calculation shown by Formula 2 or 
Formula 3. 



20 



Erot = Krot((|)-^0) 2 [Formula 2] 

(wherein Erot represents energy of dihedral angle of main chain atom in 3D structure of a protein, represents dihedral 
angle of main chain atom. tyO represents standard value of dihedral angle of main chain atom. Here, when a value of 
Krot is large, <J> is constrained by <>0.) 

EpOS = Kpos(r-rO) 2 [Formula 3] 



25 (wherein Epos represents position energy of main chain atom in 3D structure of a protein, r represents coordinate of 
main chain atom. rO represents standard value of coordinate of main chain atom. Here, when a value of Kpos is large, 
r is constrained by rO.) 

[0028] The ligand screening method according to another aspect of the present invention is the ligand screening 
method, wherein the interaction function calculating step uses the interaction function to which a dynamic property 
30 function representing dynamic property of protein is added as "elastic energy". 

[0029] The ligand screening method according to another aspect of the present invention is the ligand screening 
method, wherein the interaction function calculating step adapts "U collision" as "elastic energy" which is a function 
shown by Formula 4 in consideration of local flexibility of protein. 



U c0lIisi0 n =£|>(ij) 

4o cp (i, j) = K collision * (R collision (i j) - R) 2 

[Formula 4] 

(wherein M is number of atoms in an active site that prohibit collision, N is number of atoms of ligand. When interatomic 
45 distance R between an i-th atom of a main chain or a side chain with a little dynamic behavior in an active site, and j- 
th atom in the ligand is not more than collision distance "Rcollision(ij)", <|>(i,j) is calculated) 

[0030] The ligand screening method according to another aspect of the present invention is the ligand screening 
method, wherein the interaction function calculating step uses the interaction function to which a normal mode analysis 
result or secondary structure determination result of the protein is added as a dynamic property function that represents 

so dynamic property of protein. 

[0031] A program according to the present invention is a program which makes a computer execute a ligand screening 
method which screens for a ligand that binds to a protein when coordinate data of the protein of single chain or plural 
chains is given, the method including: post-structural-change protein coordinate dataselecting step that effects structural 
change in consideration of dynamic behavior using induced-fit parameter reflecting induced fit on the coordinate data 

55 of protein and selects post -structural-change protein coordinate data; spatial point designating step that designates a 
spatial point at which superposition with the ligand is to be conducted, from the post-structural-change protein coordinate 
data selected by the post-structural-change protein coordinate data selecting step; interaction function calculating step 
that calculates an interaction function when the protein and the ligand bind to each other using the spatial point designated 
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by the spatial point designating step and a ligand coordinate data of the ligand; and ligand evaluating step that evaluates 
the ligand that binds to the protein based on the interaction function calculated by the interaction function calculating step. 
[0032] The program according to another aspect of the present invention is the program, wherein the interaction 
function calculating step calculates the interaction function using Score (i,j) shown in Formula 1. 

5 



Sscore{i,j) = ^ 

V 



15 

(wherein djj s is a distance between i-th spatial point and j-th spatial point in the target protein. 6 y f is an interatomic 
distance between i-th atom and j-th atom in the compound, a is a coefficient for making Sscore(iJ) the maximum value 
when the group of spatial points in the target protein and the compound completely overlap with each other, p is a 
20 coefficient for giving a threshold value by which it can be defined as "overlapping") 

[0033] The program according to another aspect of the present invention is the program, wherein the interaction 
function calculating step further includes interaction function optimizing step that carries out optimization so as to make 
the score of interaction function maximum. 

[0034] The program according to another aspect of the present invention is the program, wherein the interaction 
25 function calculating step further includes: interaction energy optimizing step that calculates interaction energy with the 
protein for the superposed ligand after optimization of the interaction function by the interaction function optimizing step, 
and optimizes the interaction energy while finely adjusting conformation of ligand 3D structure data. 
[0035] The program according to another aspect of the present invention is the program, wherein the ligand evaluating 
step further includes: reevaluating step that executes the interaction function calculating step after largely changing 
30 conformation of ligand 3D structure data following optimization by the interaction energy optimizing step, and reevaluates 
the ligand that binds to the protein based on the interaction function calculated by the interaction function calculating step. 
[0036] The program according to another aspect of the present invention is the program, wherein in calculation of any 
one of the induced-fit parameter and the post-structural-change protein coordinate data or both, the post-structural- 
change protein coordinate data selecting step calculates normal mode for the protein coordinate data, determines 
35 intensity of fluctuation of each amino acid, and conduct molecular dynamic calculation using the intensity of fluctuation 
as a constraint condition. 

[0037] The program according to another aspect of the present invention is the program, wherein the post-structural- 
change protein coordinate data selecting step calculates a fluctuation value of dihedral angle of main chain according 
to normal mode calculation, and conducts molecular dynamic calculation while setting the fluctuation value as a coefficient 
40 of force K in the molecular dynamic calculation shown by Formula 2 or Formula 3. 



when i is not equal to j 
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50 



Erot = Krot((|)-<|)0) 2 [Formula 2] 

(wherein Erot represents energy of dihedral angle of main chain atom in 3D structure of a protein, represents dihedral 
angle of main chain atom. (j>0 represents standard value of dihedral angle of main chain atom. Here, when a value of 
Krot is large, <> is constrained by (J>0.) 

Epos = Kpos(r-r0) 2 [Formula 3] 



(wherein Epos represents position energy of main chain atom in 3D structure of a protein, r represents coordinate of 
55 main chain atom. rO represents standard value of coordinate of main chain atom. Here, when a value of Kpos is large, 
r is constrained by rO.) 

[0038] The program according to another aspect of the present invention is the program, wherein the interaction 
function calculating step uses the interaction function to which a dynamic property function representing dynamic property 
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of protein is added as "elastic energy". 

[0039] The program according to another aspect of the present invention is the program, wherein the interaction 
function calculating step adapts "U collision" as "elastic energy" which is a function shown by Formula 4 in consideration 
of local flexibility of protein. 



M N 

u C0 n isl0n = 21>(ij) 

cp(ij) = K collision *(R collision (ij)-R) 2 

[Formula 4] 

15 

(wherein M is number of atoms in an active site that prohibit collision, N is number of atoms of ligand. When interatomic 
distance R between an i-th atom of a main chain or a side chain with a little dynamic behavior in an active site, and j- 
th atom in the ligand is not more than collision distance "Rcollision(ij)", (|>(i,j) is calculated) 

[0040] The program according to another aspect of the present invention is the program, wherein the interaction 
20 function calculating step uses the interaction function to which a normal mode analysis result or secondary structure 
determination result of the protein is added as a dynamic property function that represents dynamic property of protein. 
[0041] A computer readable recording medium according to the present invention is the computer readable recording 
medium in which the program according to the present invention is recorded. 

25 BRIEF DESCRIPTION OF DRAWINGS 

[0042] 

FIG. 1 is a conceptual view showing a basic principle of the present invention; 
30 FIG. 2 is a view showing a dummy hydrogen atom in sp 2 orbital; 

FIG. 3 is a view showing a dummy atom in a metal atom; 

FIG. 4 is a view showing an initial coordinate (B) for docking a ligand into an active site based on structure-activity 
relationship (SAR) information; 

FIG. 5 is a view showing an initial coordinate (B) for docking a ligand into an active site based on structure -activity 
35 relationship (SAR) information; 

FIG. 6 is a view showing an initial coordinate (B) for docking a ligand into an active site based on structure-activity 
relationship (SAR) information; 

FIG. 7 is a view showing an initial coordinate (B) for docking a ligand into an active site based on structure -activity 
relationship (SAR) information; 

40 FIG. 8 is a view showing an initial coordinate (B) for docking a ligand into an active site based on structure -activity 

relationship (SAR) information; 

FIG. 9 is an illustrative view of definition of angle of hydrogen bond; 
FIG. 10 is an illustrative view of definition of angle in stacking; 
FIG. 11 is a view showing a result of normal mode analysis in MODEL 1 of 1 LUD; 
45 FIG. 12 is a view showing a result of comparison between parameters of MD and clustering and scores; 

FIG. 13 is a view showing distribution of maximum value and minimum value of dihedral angle constraint in MD at 

a fixed clustering coefficient; 

FIG. 14 is a view showing constraint parameter; 

FIG. 15 is a view showing distribution of dihedral angle constraint parameters at a fixed clustering parameter; 
50 FIG. 1 6 is a view showing distribution of dihedral angle constraint parameters at a fixed clustering parameter; 

FIG. 17 is a view showing distribution of dihedral angle constraint parameters at a fixed clustering parameter; 
FIG. 18 shows distribution of dihedral angle constraint parameters at a fixed clustering parameter; 
FIG. 19 is a view showing a result of MD of 1 LUD (MODEL 1); 

FIG. 20 is a view showing a result of comparison with a NMR structure when the MD is calculated without constraint 
55 of dihedral angle; 

FIG. 21 is a view showing a result of comparison with a NMR structure when the MD is calculated with constraint 
of dihedral angle; 

FIG. 22 is a view showing alignment of 1 CBQ; 
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FIG. 


80 


is 


a view showing 



1 3D structure of 1CBQ; 
I 3D structure of 1CBQ; 

I difference between an X-ray structure and a model structure of 1CBQ by rms; 
! results of normal mode analysis for an X-ray structure and a model structure of 1CBQ; 
I results of normal mode analysis for an X-ray structure and a model structure of 1CBQ; 
I results of MD calculation for an X-ray structure and a model structure of 1CBQ; 
I alignment of 1J9G; 
i 3D structure of 1J9G; 
I 3D structure of 1J9G; 

i difference between an X-ray structure and a model structure of 1J9G by rms; 
! results of normal mode analysis for an X-ray structure and a model structure of 1J9G; 
! results of normal mode analysis for an X-ray structure and a model structure of 1J9G; 
! results of MD calculation for an X-ray structure and a model structure of 1J9G; 
! alignment of 1 MMB; 
I 3D structure of 1 MMB; 
I 3D structure of 1 MMB; 

i difference between an X-ray structure and a model structure of 1 MMB by rms; 
! results of normal mode analysis for an X-ray structure and a model structure of 1J9G; 
l results of normal mode analysis for an X-ray structure and a model structure of 1J9G; 
I results of MD calculation for an X-ray structure and a model structure of 1MMB; 
I 3D structure of dihydrofolate reductase; 
j a result of normal mode analysis of 1 BZF (MODEL 1 8); 
I a result of MD calculation of 1 BZF (MODEL 1 8); 

I structure-activity relationship information obtained from 1LUD (MODEL4); 
I a result of active site-ligand binding analysis in 1BZF (MODEL 18); 
I 1 BZF (MODEL4)-ligand binding; 
I 1 BZF (MODEL4)-ligand binding; 
I 1 BZF (MODEL4)-ligand binding; 
I 3D structure of heat shock protein 90; 
I a result of normal mode analysis of 1YER; 
I a result of MD calculation of 1 YER; 

I structure -activity relationship information obtained from 1YER; 
I a result of active site-ligand binding analysis in 1 YER; 
1 1YER-ligand binding; 
! 1 YER-ligand binding; 

I 3D structure of mitogen-activated protein kinase; 
i a result of normal mode analysis of 1A9U; 
! a result of MD calculation of 1 A9U; 

I structure -activity relationship information obtained from 10UK; 
I a result of active site-ligand binding analysis in 1 A9U; 
I 1 A9U-ligand binding; 
! 1 A9U-ligand binding; 
I 1 A9U-ligand binding; 
i 3D structure of 1 AIX; 
! a result of in silico screening; 
! a protein/ligand complex structure; 
I a protein/ligand complex structure; 
! a protein/ligand complex structure; 
I a protein/ligand complex structure; 
i a protein/ligand complex structure; 
i a protein/ligand complex structure; 
! a protein/ligand complex structure; 
! a protein/ligand complex structure; 
I 3D structure of SARS protease; 
i a result of normal mode analysis of 1 UK3 (B chain); 
! a result of MD calculation of 1 UK3 (B chain); 

I structure -activity relationship information obtained from 1UK4 (B chain); 
i a result of in silico screening in 1 UK3 (B chain); 
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FIG. 81 is a view showing a result of comparison between 1 UK3 and 1 UK4; 
FIG. 82 is a view showing Ranking 1 of in silico screening; 
FIG. 83 is a view showing Ranking 1 of in silico screening; 

FIG. 84 is a view showing structure -activity relationship information obtained from 1 UK3 (B chain); 
5 FIG. 85 is a view showing a result of comparison between 1 UK3 (B chain) and 1 UK4 (B chain); 

FIG. 86 is a view showing a result of in silico screening executed while designating three positions in SAR; 

FIG. 87 is a view showing structure-activity relationship information obtained from 1 UK3 (B chain); 

FIG. 88 is a view showing a result of high throughput screening executed while designating five positions in SAR; 

FIG. 89 is a view showing a result of comparison between 1 UK3 (B chain) and 1 UK4 (B chain); 
10 FIG. 90 is a view showing structure-activity relationship information obtained from 1 UK3 (B chain); 

FIG. 91 is a view showing a result of high throughput screening executed while changing the designated ligand 

atom type; 

FIG. 92 is a view showing a result of comparison between 1 UK3 (B chain) and 1 UK4 (B chain); 
FIG. 93 is a view showing structure-activity relationship information obtained from 1 UK4 (B chain); 
is FIG. 94 is a view showing a result of high throughput screening executed with fixed receptor; 

FIG. 95 is a view showing a result of comparison between a ligand in which 1 UK3 and 1 UK4 are superposed, and 
a ligand of screening result; 

FIG. 96 is a view showing distribution of dihedral angle constraint MD parameters in 1 AXJ; 
FIG. 97 is a view showing a result of MD calculation of 1 LUD (MODEL 1 ); 
20 FIG. 98 is a view showing a result of MD calculation of 1 LUD (MODEL 1); 

FIG. 99 is a view showing a result of MD calculation of 1 LUD (MODEL 1 ); 
FIG. 1 00 is a view shov 
FIG. 101 is a view shov 
FIG. 1 02 is a view shov 
25 FIG. 1 03 is a view shov 

FIG. 1 04 is a view shov 
FIG. 1 05 is a view shov 
FIG. 106 is a view shov 
FIG. 107 is a view shov 
30 FIG. 1 08 is a view shov 

FIG. 1 09 is a view shov 
FIG. 11 0 is a view shov 
FIG. 111 is a view shov 
FIG. 1 12 is a view showing structure-activity relationship information modified for 1 BZF; 
35 FIG. 1 1 3 is a view showing a result of ligand binding analysis of 1 BZF (MODEL 1 8); 

FIG. 1 1 4 is a view showing a result of ligand binding analysis of 1 BZF (MODEL 1 8); 

FIG. 1 1 5 is a flowchart showing one example of main process in the present system in the present embodiment; 
FIG. 116 is a block diagram showing one example of configuration of the present system to which the present 
invention is applied; 

40 FIG. 1 17 is a block diagram showing one example of configuration of an interaction function calculating unit 102c 

of the present system to which the present invention is applied; and 

FIG. 1 1 8 is a block diagram showing one example of configuration of a ligand evaluating unit 1 02d of the present 
system to which the present invention is applied. 

45 BEST MODE(S) FOR CARRYING OUT THE INVENTION 

[0043] Exemplary embodiments of a ligand screening apparatus, a ligand screening method, a program and a recording 
medium of the present invention will be explained in detail with reference to the attached drawings. It is to be noted that 
the present invention is not limited to these exemplary embodiments. 

50 [0044] Several terms used herein have the following meanings unless otherwise specified. 

[0045] The term "target protein" means a protein whose precise 3D structure is already determined by X-ray crystal- 
lographic analysis, NMR analysis or homology modeling method and which is an object of ligand screening. 
[0046] The term "atomic coordinate" describes 3D structure in 3D space. It includes relative distances from the origin 
of a certain spatial point in three directions which are perpendicular to each other, and hence an atomic coordinate is 

55 described by a vector comprising three numbers per each atom existing in a protein except for hydrogen atoms. 
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[Basic principal of the present invention] 

[0047] The basic principal of the present invention will now be explained with reference to FIG. 1. FIG. 1 is a conceptual 
diagram showing the basic principal of the present invention. The present invention relates to a ligand screening appa- 

5 ratus, a ligand screening method, a computer program and a recording medium, wherein when a protein 3D structure 
consisting of a single chain or plural chains is given, a parameter reflecting induced-fit from the given 3D structure of 
the target protein and a 3D structure coordinate after structural change are calculated in advance by e.g., normal mode 
calculating method or molecular dynamic calculating method; an interaction function in binding of the target protein and 
other substance (ligand) is defined using the parameter and the 3D structure coordinate after structural change; and a 

10 substance (ligand) which binds to the target protein is evaluated and chosen by the interaction function. 

[0048] In the present invention, first, one ligand is selected from a compound database, and 3D structure data of the 
ligand is acquired (Step S-1). Also 3D structure data of a target protein is acquired (Step S-2). 

[0049] Then in the present invention, based on the 3D structure data of the protein, structural change considering 
dynamic behavior is conducted using an induced-fit parameter reflecting induced-fit, to prepare plural sets of protein 
is coordinate data after structural change (hereinafter referred to as "post-structural-change protein coordinate data"), and 
one set of post-structural-change protein coordinate data is randomly chosen (Step S-3). 

[0050] Next, in the present invention, a spatial point in the post-structural-change protein coordinate data to which the 
ligand is to be superposed is designated (Step S-4). The spatial point may be designated, for example, by the methods 
(1) and (2) as will be described below. 

20 

(1 ) Designation of spatial point by generation of dummy atom (Step S-4-1 ) 

[0051] Focusing on a hydrogen bond in interaction between ligand and protein, a hydrogen bond site in the protein is 
designated as a spatial point. The important issues in hydrogen bond are distance and angle. That is, a hydrogen bond 
25 donor (hereinafter referred to as "donor") is required for calculating an angle. 

[0052] In the present invention, when there is no hydrogen atom in an active site and the ligand, a dummy hydrogen 
atom is generated in accordance with the following rule. 

1) A dummy atom is generated in an equilateral triangle centered at sp 2 orbital atom (FIG. 2). More specifically, as 
30 shown in FIG. 2, a dummy hydrogen atom (B) is generated at a free position in the equilateral triangle centered at 

the nitrogen atom (A) of sp 2 orbital atom. 

2) As to a sp 3 orbital atom, when it is at a distance to form a hydrogen bond, it is deemed to turn so as to share the 
hydrogen atom, so that only the distance is considered in calculation of hydrogen bond interaction. Therefore, no 
dummy atom is generated for the sp 3 orbital atom. 

35 

[0053] As to metal and water, since they can mediate binding between active site and ligand binding, a dummy atom 
is generated at an interacting position in the manner as described below. 

1) To metal such as iron, dummy atoms are generated in a regular octahedron (FIG. 3). That is, as shown in FIG. 
40 3 } dummy atoms (B) are generated at free positions in the regular octahedron centered at zinc (A). 

2) To water, dummy atoms are generated in a regular tetrahedral. It is not necessary to generate a dummy atom in 
the direction in which it interacts with an active site. 

(2) Designation of spatial point using structure-activity relationship information (Step S-4-2) 

45 

[0054] Also in the present invention, a spatial point is designated by inputting information of the following items 
(A) to (D) in focus of structure -activity relationship (SAR) information of ligand. 

(A) Atom of active site obtained form SAR information (hereinafter referred to as "A atom"). It follows PDB format. 
50 (B) Atom type of ligand which is expected to interact with "A atom" (hereinafter referred as "B type"). It follows MOL2 

format of SYBYL. 

(C) Intensity of interaction between "A atom" and "B type" (hereinafter referred to as "C intensity"). 

(D) Distance of interaction between "A atom" and "B type" (hereinafter referred to as "D distance") (unit: angstrom). 

55 [0055] In the present invention, a spatial point may be created according to the rules shown 1) to 4) below based on 
the input information (A) to (D) and using an initial coordinate of ligand in an active site of protein. 

1) When "A atom" is donor or metal and water (when designation of SAR information on the active side end is 
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hydrogen bond donor or metal atom), a point and a circumference at "D distance" from "A atom" with respect to the 
direction of dummy atoms generated in Step S-4-1 are selected as initial coordinates (FIG. 4 and FIG. 5). 

2) When "A atom" is a sp 3 orbital atom (when designation of SAR information on the active site end is a sp 3 orbital 
atom), a circumference at "D distance" from "A atom" is selected as initial coordinate (FIG. 6). 

3) When "A atom" is a hydrogen bond acceptor (hereinafter referred to as "acceptor") (when designation of SAR 
information on the active site end is a hydrogen bond acceptor), a position and a circumference at "D distance" on 
a bonding extension of "A atom" are selected as initial coordinates (FIG. 7). 

4) In other cases (when designation of SAR information on the active site end is an atom other than the above), a 
point on the surface of a sphere centered at "A atom" and having a radius of "D distance" is selected as an initial 
coordinate (FIG. 8). 

[0056] Here in contrast with the above 1 ) to 4), an initial coordinate of ligand may be designated directly. 
[0057] Returning again to FIG. 1 , in the present invention, pairs of an atom in the ligand coordinate data selected in 
Step S-1 and a spatial point in the protein coordinate data designated in Step S-4 are randomly selected so that they 
are not overlapped with each other (Step S-5). 

[0058] Then in the present invention, a score Sscore(iJ) which is an interaction function represented by the following 
formula 1 is calculated (Step S-6). 



[0059] Here, djj s is a distance between i-th spatial point and j-th spatial point in target protein. djj c is an interatomic 
distance between i-th atom and j-th atom in the compound, a is a coefficient for making Sscore(iJ) the maximum value 
when the group of spatial points in the target protein and the compound completely overlap with each other, p is a 
coefficient for giving a threshold value by which it can be defined as "overlapping". 
[0060] Preferably, a is 1 .5 and p is 0.8. 

[0061] Next, in the present invention, adjustment (optimization) is conducted so thatthe score of the interaction function 
determined in Step S-6 is maximum (Step S-7). Here, as the procedure for maximizing the score, a simulated annealing 
method is exemplified. For reducing the time, it is preferred to employ a process in which Step S-5 and Step S-6 are 
repeated plural times (for example, 1 000 times) to find a pair in which Sscore(iJ) is maximum, and a ligand is superposed 
to an initial coordinate based on information of the found pair. 

[0062] Next, in the present invention, interaction energy with respect to the protein is calculated for the ligand which 
is superposed in optimization of the interaction function in Step S-7, and the resultant interaction energy is optimized 
while finely adjusting the conformation of the ligand coordinate data (Step S-8). The fine adjustment of ligand conformation 
may be conducted by translating, rotating or changing coordinates in such an extent that the angle around a single bond 
will not exceed 0.3 by RSMD, about the ligand coordinate calculated in Step S-7. 

[0063] Here, the fine adjustment of conformation of ligand coordinate data is preferably optimized by random search, 
for example. In random search, minimal changes between active site of protein and ligand are repeated, for example, 
8000 times in accordance with the items 1) to 3) below, to make an optimum energy "U optimum" is minimum. 

1) Up to five bonds are selected at random from rotatable bonds, and each bond is randomly rotated within the 
range of ± 1 0.0 degrees for changing the conformation of ligand. This process is effected, for example every three 
times. 

2) In each of x, y and z axial directions, ligand is randomly translated within the range of ± 1 .0 angstrom. This process 
is effected, for example every twice. 

3) In each rotation center coordinate, a coordinate of the rotation center is randomly moved within the range of ±1 .0 
angstrom, and the ligand is randomly rotated within the range of ±5.0 degrees with respect to each of direction of 
three orthogonal axes. This process is effected, for example every five times. 

[0064] Next, in the present invention, conformation of ligand coordinate data is largely changed, and then the process 



when i is not equal to j 




when i is equal to j 
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from Step S-5 is started again and the process up to Step S-8 for optimization is repeated (Step S-9). Modification of 
conformation maybe conducted by translating, rotating or changing coordinates in such an extent that the angle around 
a single bond will be equal to or more than 0.3 by RSMD, about the ligand coordinate calculated in STEP S-7. 
[0065] Here, optimization by largely moving conformation of ligand is achieved by selecting five rotatable bonds at 
5 random, for example, with respect to the conformation in the "U optimized" which is energy optimized in Step S-8, and 
rotating them randomly in accordance with a rotation angle interval determined for each atom type. Then the process 
after Step S-5 is repeated, for example, 5000 times. 

[0066] After changing conformation of ligand, internal energy of the ligand "U internal" is calculated, and if the value 
is 500.0 or more, a subsequent calculation may be skipped, and the next ligand conformation may be caused to generate. 
10 Next, in the present invention, the process from Step S-4 to Step S-9 is conducted for the plural sets of post-structural- 
change protein coordinate data prepared in Step S-3, and an optimum coordinate of protein-ligand complex, and optimum 
energy "U optimum" are calculated (Step S-10). 

[0067] Next, in the present invention, the above process is conducted for every ligand in the compound database 
prepared in Step S-1 , and a ligand which possibly binds to the target protein is selected from the compound database 
15 (Step S-1 1). 

[0068] In the above, a basic principal of the present invention was described. In the present invention, however, when 
any one of a parameter reflecting induced-fit of protein and post-structural-change 3D structure coordinate or both are 
calculated using molecular dynamic calculation method, normal mode with respect to 3D structure of the target protein 
may be calculated; fluctuations of respective amino acids may be determined; molecular dynamic calculation may be 

20 conducted while using the intensity of the fluctuation as a constraint condition; and thereby molecular dynamic calculation 
may be conducted so that 3D structure of protein will not largely differ from the energy optimized structure. 
[0069] In the present invention, the molecular dynamic calculation according to the present molecular dynamic cal- 
culation method may be so conducted that, for example, a fluctuation value of dihedral angle of main chain atom is 
calculated from the normal mode calculation, and the fluctuation value is substituted into coefficient of force K in the 

25 molecular dynamic calculation as shown in Formula 2 or Formula 3. 



35 



Erot = Krot (()> - §ti) 2 [Formula 2] 

[0070] Erot represents energy of dihedral angle of main chain atom in 3D structure of a protein. § represents dihedral 
angle of main chain atom. (j>0 represents standard value of dihedral angle of main chain atom. Here, when a value of 
Krot is large, § is constrained by (J>0. 

Epos = Kpos (r - rO) 2 [Formula 3] 



[0071] Epos represents position energy of main chain atom in 3D structure of a protein, r represents coordinate of 
main chain atom. rO represents standard value of coordinate of main chain atom. Here, when a value of Kpos is large, 
40 r is constrained by rO. 

[0072] In the present invention, as a target function (interaction function) in evaluation of interaction between ligand 
and protein, a dynamic property function that expresses dynamic properties of protein may be added to the conventional 
interaction energy function as "Elastic energy". As a result, it is possible to rapidly calculate interaction energy from 3D 
structure coordinate of protein, and to clearly describe physicochemical property regarding dynamic behavior of the 
45 protein. 

[0073] Here, as the "elastic energy", the function "U collision" shown by Formula 4 below may be adapted in consid- 
eration of local flexibility of protein. In the following example, when interatomic distance R between an i-th atom of a 
main chain or a side chain with a little dynamic behavior in active site, and a j-th atom in a ligand is not more than collision 
distance "Rcollision(iJ)", calculation of (|>(i,j) is defined as follows. 

50 

^collision = f Z<P(i/j) 



55 



<p (i, j) = K collision * (r collision(i r j) - r) 2 



[0074] M is number of atoms in active site that prohibit collision, N is number of atoms of ligand. "K collision" is preferably 
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1 000.0. "Rcollision(i j)" is a sum of van der Waals radii of i-th atom in the active site and j-th atom in the ligand. 
[0075] Here, with respect to each atom in the active site, when weighing w(i) that allows collision is defined, the function 
"U collision" shown by the following Formula 5 is used. w(i) is real number ranging from 0 to 1 . 

5 

^ collision = Z Z CP (i, j) 
J=l 7=1 

cp (i, j) = w (i) * K collision * (r collision (i, j) - R) 2 

10 

[Formula 5] 

[0076] M is number of atoms in active site, N is number of atoms in the ligand. "K collision" is preferably 1000.0. 
" Rcollision(iJ)" is a sum of van der Waals radii of i-th atom in the active site and j-th atom in the ligand. 
15 [0077] The "elastic energy" may be defined by using functions shown by Formula 6 below. 



Ev = w (hard shape region), 

. [Formula 6] 

20 E - 0 (soft shape region) 

[0078] Here, "hard shape region" means a part exhibiting small dynamic behavior in 3D structure of protein, and "soft 
shape region" means a part exhibiting large dynamic behavior. Preferably, W is a constant and 100. 
[0079] In the present invention, as a function of dynamic property of protein, a result of normal mode analysis of the 
25 protein or a result of secondary structure determination may be used. In determination of secondary structure, small 
fluctuation for helix or sheet regions of protein, and large fluctuation for other regions are applied to the constraint 
condition in interaction evaluation function and molecular dynamic calculation. 

[0080] Also, according to the present invention, every plural coordinates can be evaluated equally, in short time full- 
automatically in screening of a ligand that binds to a target protein, even when the calculated target protein (protein 
30 coordinate data) includes typical plural 3D structural coordinates, or 3D structure of the target protein consists of plural 
3D structure coordinates such as analytical result of NMR spectrum. 

[0081] Also, according to the present invention, (1 ) coordinate data of protein is subjected to structural change while 
dynamic behavior is considered by using an induced-fit parameter reflecting induced-fit, and post-structural-change 
protein coordinate data is selected; (2) from the selected post-structural-change coordinate data of protein, a spatial 

35 point at which superposition with ligand is to be executed is designated; (3) an interaction function when the protein and 
the ligand bind is calculated using the designated spatial point and ligand coordinate data of a ligand; and (4) a ligand 
which binds to the protein is evaluated based on the calculated interaction function. This is advantageous to screen for 
a ligand that binds to an induced-fit type receptor protein with high efficiency and accuracy while considering flexibility 
of receptor and flexibility of ligand. 

40 [0082] In addition, according to the present invention, it is possible to predict a new ligand that binds to 3D structure 
of a target protein by acquiring a parameter reflecting dynamic behavior of the protein which is very important for binding 
between protein and ligand, and using a novel interaction evaluation function in relation to a ligand, which reflects dynamic 
behavior of the target protein. As a result, in contrast to conventional methods, it is possible to construct 3D structures 
of proteins that are more reliable and suitable for design of pharmaceuticals and the like at a speed enough to keep up 

45 with enormous genomic sequences that are globally analyzed. Conventionally, in silico screening, an algorithm capable 
of satisfactorily handling the induced binding that is important for interaction between protein and ligand has not been 
established, however, in the present invention, by employing a calculation formula which allows simple inclusion of a 
parameter representing "fluctuation" of protein obtainable from a normal mode calculation result or secondary structure 
prediction, into an interaction energy function between protein and ligand, it is possible to satisfactorily handle induced 

50 binding. 

[0083] Further, in molecular dynamic simulation, the present invention is featured by conducting normal mode calcu- 
lation of a target protein with regard to a parameter reflecting dynamic behavior of the target protein and to an interaction 
evaluation function in relation to ligand, and reflecting the calculation result into molecular dynamic calculation. Conven- 
tionally, molecular dynamic calculation is usedtosimulate dynamic behavior or protein, however, whenmoleculardynamic 
55 calculation is conducted on a target protein by a conventional method, the protein 3D structure will greatly differ from 
the coordinate that is analyzed by X-ray analysis, NMR and the like. Such difference includes physicochemical description 
for dynamic behavior of the protein, however, the behavior is sometimes contradictory to the experimental result of 
dynamic behavior proved by NMR or the like, so that the simulation is not necessary accurate. For this reason, in 
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conducting molecular dynamic calculation, it is necessary to conduct simulation while fixing 3D structure of protein to 
some extent, and in the present invention, we developed a measure for constraining dihedral angle of a main chain atom 
in energy function in molecular dynamic calculation. Further, as a constraint condition of dihedral angle, normal mode 
calculation of a target protein is conducted in advance for its parameter, and fluctuation of dihedral angle of main chain 

5 atom is calculated. The fluctuation is used as a parameter in such a manner that according to the intensity of the 
fluctuation, the constraint condition is relaxed for a region exhibiting large fluctuation, and the constraint condition is 
intensified for a region exhibiting small fluctuation. Therefore, according to the present invention, it is possible to describe 
dynamic behavior with high accuracy by conducting molecular dynamic simulation of protein under such a condition. 
Additionally, it is possible to acquire a coordinate describing dynamic behavior of protein from the molecular simulation 

10 thus calculated, and by using this, it is possible to conduct ligand screening using various shapes of ligand binding sites. 
[0084] As a result of the above, the present invention enables new ligands to be found that would not be found by 
conventional in silico screening, and enables execution of analysis of protein-ligand interaction including"induced binding" 
that is enabled only by time-consuming molecular simulation heretofore, in short time. 

[0085] The present invention is applicable to "in silico screening" taking induced binding phenomenon into account 
15 more intensively than existent software, and achieves simplification based on correct understandings of induced binding 

phenomenon and hydrophobic interaction. Since the present invention is simplified, it allows handling of more target 

proteins by automation. As a result, it is possible to screen new possible compounds, for example, from more than a 

million of compound databases, and it is possible to screen possible compounds within a realistic time from a scale of 

databases that cannot be experimentally handled. 
20 [0086] Further, since the present invention enables interaction analysis between protein and ligand to be conducted 

in a short time, interaction between many proteins causes for example, metabolism andtoxity, and drugs can be analyzed, 

and thus prediction of in silico metabolism and toxity of drug is enabled. 

[0087] Molecules that can be handled as ligands in the present invention are understood as any substances including 
proteins, peptides, DNAs, drug ingredients, metals, ions, sugars, nucleic acid components and hormones because used 
25 number and kinds of ligands are not limited. The present invention enables specific molecular designing of agricultural 
chemicals, pharmaceuticals and the like. 

[0088] In a function for evaluating interaction energy between ligand and protein, conventionally, electrostatic energy 
term and van der Waals term in a docking method, and an adjustment term for expressing dynamic behavior used in a 
soft docking method are mainly used, however, in the present invention, instead of using an adjustmentterm for expressing 
30 dynamic behavior used in a soft docking method or the like, a principal of elastic collision which is used in classical 
mechanics is applied to interaction between protein and ligand, and thus physicochemical property of interaction between 
protein and ligand is more clarified. This provides relationship between structural change of protein and interaction, and 
helps rapid and correct understanding of function of a ligand. 

[0089] 3D structure of protein used in the present invention may be adapted to a 3D structure coordinate created by 
35 using an empirical modeling method (in particular, homology modeling method and threading method) of protein besides 
3D structure of protein whose 3D coordinate is determined by X-ray crystalline structure analysis or the like. 

[System configuration] 

40 [0090] Now, configuration of the present system to which the present invention is applied will be explained in detail 
with reference to FIG. 1 1 6. FIG. 1 1 6 is a block diagram showing one example of configuration of the present system to 
which the present invention is applied, and only the parts in configuration that are relevant to the present invention are 
schematically shown. 

[0091] As shown in FIG. 116, the present system is generally made up of a ligand screening apparatus 100 for 
45 evaluating and selecting a substance (ligand) that binds to a protein, and an external system 200 for providing external 
databases concerning ligand 3D structure data and protein 3D structure data, as well as a variety of external programs 
that are communicatively connected via a network 300. 

[0092] The network 300 has a function of mutually connecting the ligand screening apparatus 100 and the external 
system 200, and is implemented by the Internet or LAN, for example. 

50 [0093] The external system 200 is mutually connected with the ligand screening apparatus 1 00 via the network 300 
and has a function of providing a user with external databases concerning ligand 3D structure data and protein 3D 
structure data as well as websites for executing various external programs. Here, the external system 200 may be 
implemented by a WEB server, ASP server or the like, and its hardware configuration may be implemented by commer- 
cially available workstation, personal computer and the like information processing devices and attached devices thereof. 

55 Further, each function of the external system 200 is realized by a CPU, disc device, memory device, input device, output 
device, communication controlling device and the like in hardware configuration of the external system 200, and programs 
controlling them. 

[0094] The ligand screening apparatus 1 00 generally includes, a control unit 1 02 such as CPU forcentrically controlling 



15 



EP 1 724 697 A1 



the overall ligand screening apparatus 1 00; a communication controlling interface unit 1 04 connected with a communi- 
cation device (not shown) such as router connected with communication line or the like; a memory unit 106 for storing 
various databases and files; and an input/output controlling interface unit 1 08 connected with an input device 1 1 2 and 
an output device 114, and these units are communicably connected via certain communication paths. Further, the ligand 
5 screening apparatus 100 is communicably connected to the network 300 via a communication device such as router 
and a wired or wireless communication line such as an exclusive line. 

[0095] Various databases, tables and files (ligand coordinate database 106a to ligand evaluation resultfile 106f) stored 
in the memory unit 106 is a storage unit such as stationary disc device, and stores various programs, tables, files and 
databases, files for web pages used for various processings. 

10 [0096] Among these constituents of the memory unit 1 06, the ligand coordinate database 1 06a is a ligand coordinate 
data storing unit that stores ligand coordinate data. A protein coordinate database 106b is a protein coordinate data 
storing unit that stores protein coordinate data. A post-structural-change protein coordinate data file 106c is a post- 
structural-change protein coordinate data storing unit that stores post-structural-change protein coordinate data selected 
by a post-structural-change protein coordinate data selecting unit 102a as will be described later. A designated spatial 

15 point file 1 06d is a designated spatial point storing unit that stores information concerning a spatial point designated by 
a spatial point designating unit 102b as will be described later. An interaction function calculation resultfile 106e is an 
interaction function calculation result storing unit that stores information concerning calculation result of interaction 
function calculated by an interaction function calculating unit 102c as will be described later. A ligand evaluation result 
file 1 06f is a ligand evaluation result storing unit that stores information concerning evaluation result of ligand evaluated 

20 by a ligand evaluating unit 1 02d as will be described below. 

[0097] The communication controlling interface unit 1 04 controls communication between the ligand screening appa- 
ratus 100 and the network 300 (or communication device such as router). In other words, the communication controlling 
interface unit 1 04 has a function of communicating data with other terminal via a communication line. 
[0098] The input/output controlling interface unit 1 08 controls the input device 1 12 and the output device 114. Here, 

25 as the output device 1 1 4, a speaker or the like as well as a monitor (including home TV set) may be used (hereinafter, 
the output device 1 1 4 is sometimes referred as "monitor"). As the input device 1 1 2, a keyboard, a mouse, a microphone 
or the like may be used. A monitor also realizes a pointing device function together with a mouse. 
[0099] The control unit 1 02 has an internal memory for storing control programs such as OS (Operating System), data 
in need and the like, and conducts information processing for executing various processings by these programs and the 

30 like. Functionally, the control unit 102 generally includes the post-structural-change protein coordinate data selecting 
unit 1 02a, the spatial point designating unit 1 02b, the interaction function calculating unit 1 02c and the ligand evaluating 
unit 1 02d. 

[0100] Among these constituents of the control unit 1 02, the post-structural-change protein coordinate data selecting 
unit 1 02a is a post-structural-change protein coordinate dataselecting unit that calculates structural change on coordinate 

35 data of protein using an induced-fit parameter reflecting induced-fit while taking dynamic behavior into account, and 
selects post-structural-change protein coordinate data. The spatial point designating unit 102b is a spatial point desig- 
nating unit that selects a spatial point at which superposition with ligand is to be conducted, from post-structural-change 
protein coordinate data selected by the post-structural-change protein coordinate data selecting unit 102a. 
[0101] The interaction function calculating unit 102c is an interaction function calculating unit that calculates an inter- 

40 action function when the protein and the ligand bind using the spatial point designated by the spatial point designating 
unit 102b and ligand coordinate data of ligand. Here, the interaction function calculating unit 102c further includes an 
interaction function optimizing unit 102c1 and an interaction energy optimizing unit 102c2 as shown in FIG. 117. The 
interaction function optimizing unit 1 02c1 is an interaction function optimizing unit that optimizes so that the score of the 
interaction function is maximum. The interaction energy optimizing unit 102c2 is an interaction energy optimizing unit 

45 that calculates interaction energy with the protein for the superposed ligand after optimization of the interaction function 
by the interaction function optimizing unit 1 02c1 , and optimizingthe interaction energy while finely adjusting conformation 
of ligand 3D structure data. 

[0102] Returning to FIG. 116, the ligand evaluating unit 102d is a ligand evaluating unit that evaluates a ligand that 
binds to the protein based on the interaction function calculated by the interaction function calculating unit 1 02c. Here, 
50 the ligand evaluating unit 102d also includes a reevaluating unit 102d1 as shown in FIG. 118. The reevaluating unit 
102d1 is a reevaluating unit that reevaluates a ligand that binds the target protein based on an interaction function 
calculated by the interaction function calculating unit 102c that is executed after largely changing conformation of ligand 
3D structure data following optimization by the interaction energy optimizing unit 102c2. 
[0103] The details of the processes executed by these units will be described later. 

55 

[System process] 

[0104] Here, one example of main process of the present system in the embodiment configured as described above 
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will be explained in detail with reference to FIG. 1 15, for example. FIG. 11 5 is a flowchart showing one example of main 
process of the present system in the present embodiment. Referring now to FIG. 1 1 5, ligand screening using 3D structure 
of protein and induced fitting will be explained. 

[0105] First, the ligand screening apparatus 100 prepares database of ligand including 3-dimentional coordinate by 
a process of control unit 102 and stores the prepared database in the ligand coordinate database 106a of the memory 
unit 1 06 (Step SO). Here, as a database of ligand, for example, available compound databases such as ACD, imaginary 
compound data collected by drawing a compound and the like may be used. Preferably, database of ligand is converted 
into three dimensional by molecular dynamic technique. 

[01 06] Next, the ligand screening apparatus 1 00 selects 3D structure of target protein for screening the ligand database 
prepared in Step SO for a specified ligand by a process of the control unit 1 02, acquires 3D structure data (3D coordinate) 
of the selected protein, and stores it in the protein coordinate database 106b of the memory unit 106 (Step S10). As the 
3D coordinate, 3D structure coordinate created by PDB which is a public database or by homology modeling method is 
preferably used. 

[0107] Next, the ligand screening apparatus 100 conducts normal mode calculation for the target protein selected in 
Step S10 by a process of the post-structural-change protein coordinate data selecting unit 102a, and determines fluc- 
tuation in position of main chain atom and fluctuation in dihedral angle (Step S20). More specifically, first, a parameter 
representing dynamic behavior of the target protein specified in Step S10 is acquired from the database of calculation 
result by a normal mode analysis method or a parameter is acquired by conducting secondary structure determining 
calculation. 

[0108] First, a method of acquiring the parameter representing dynamic behavior of the protein in Step S20, by a 
normal mode analysis method will be explained. The normal mode analysis method is a method of approximating potential 
energy as a secondary function of displacement, and solving a dynamic equation precisely, thereby analyzing microscopic 
vibrations around the optimized structure. The dynamic equation to be solved is the following Formula (1) or Formula 
(2). For details of the normal mode analysis method, see "Wilson, E. B., Decius, J. C, and Cross P. C. 1995 Molecular 
vibration. McGraw-Hill.". 



TUA = VU 



(2) 



[0109] Here, co k is eigen value, U ik is eigen vector, and 5y is delta of Kronecker. Ty and Vy are respectively concern 
motion energy E k and potential energy V, and the following Formula (3) and Formula (4) are provided. 



(3) 



U v dw — K-vdw 

i,j(>i+2) 



3.8 

v D uy 



,12 



3.8 



(4) 



[0110] Here, qj is a coordinate corresponding to degree of freedom of vibration, qj' (it means "qj dot" in Formula (3)) 
is differentiation of qj by time, qj is expressed by the following Formula (5). 
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k 

5 

[01 1 1 ] Here, A jk is a coefficient which connects motion Q k and individual atomic motion qj. qj° is an optimized coordinate. 
Q k is normal mode shown by the following formula. 

Qk =a k cos(co k t + 8 k ) 

[0112] Here, a k and 5 k are determined as an initial condition. 

[0113] Next, in Step S20, with respect to a reference protein, using the eigen value and the eigen vector obtained 
15 above, positional fluctuation of each Ca atom at a certain temperature and a certain eigen value is calculated, and this 
value of fluctuation is defined as a value of fluctuation of the amino acid in which the Ca is contained. As to a value of 
fluctuation of each amino acid in a target protein, using the alignment in Step S50, a value identical to that for the 
reference protein is applied as a value of fluctuation of the target protein in a corresponding amino acid residue pair 
based on comparison of the target sequence and the reference sequence. For those whose values of fluctuation are 
20 not obtained, a predetermined value is applied. The value of each amino acid in the target protein thus obtained is used 
as a parameter representing dynamic behavior of the target protein. 

[0114] Now, a method of acquiring a parameter representing dynamic behavior of protein by secondary structure 
determining calculation in Step S20 will be explained. Secondary structure determination is calculated from 3D structure 
coordinate of protein. As software, DSSP, STRIDE and the like are preferred, but basically, a method which makes 

25 determination based on angle of twist of main chain of protein and hydrogen bond pattern may be used. Here, "DSSP 
(Dictionary of protein secondary structure of protein)" is software that determines ce-helix and (3-sheet by using a PDB 
format file as an input file and analyzing a hydrogen pattern of main chain, an internal-rotation angle and the like. For 
details of the DSSP, see "Kabsch, W. & Sander, C. (1983) Dictionary of protein secondary structure: pattern recognition 
of hydrogen-bonded and geometrical features. Biopolymers, 22:2577-2637". "STRIDE (Protein secondary structure 

30 assignment from atomic coordinate)" is software that determines a-helix and p-sheet by using a PDB format file as an 
input file and analyzing a hydrogen pattern of main chain, an internal- rotation angle and the like. For details of STRIDE, 
see "Frishman, D & Argos, P. (1995) Knowledge-based secondary structure assignment. Proteins: structure, function 
and genetics, 23, 566-579". 

[0115] Secondary structure calculation using the above software or the like is conducted on the reference protein, 
35 and a-helix structure, p-sheet structure or loop structure formed by each amino acid is determined. As to secondary 
structure determination of each amino acid in a target protein, using the alignment in Step S50, the same value for the 
reference protein is applied as secondary structure of the target protein in a corresponding amino acid residue pair based 
on comparison of the target sequence and the reference sequence. For those whose secondary structures are not 
determined, a predetermined result is applied. The secondary structure of each amino acid in the target protein thus 
40 obtained is used as a parameter representing dynamic behavior of the target protein. 

[0116] Here, in Step s20, as a parameter representing dynamic behavior of the target protein, a calculation result 
acquired by the normal mode analysis method of the reference protein is preferably used. As such a calculation result, 
those separately stored in a database is used. Also, the secondary structure determining calculation result is used in 
place of normal mode analysis calculation when a reference protein for which normal mode analysis is not conducted 
45 is used. 

[01 1 7] Returning again to a main process of FIG. 115, the ligand screening apparatus 1 00 conducts molecular dynamic 
calculation using intensity of fluctuation of target protein determined in Step S20 by a process of post-structural-change 
protein coordinate data selecting unit 102a as a constraint condition (Step S30). 

[0118] Concretely, first, positional constraint energy of main chain "U position" is introduced into Formula 6 as shown 
50 below, and minimization (condition: temperature 300K, rectangular water bath capable of placing two at least spheres 
of water molecule in all directions outside the surface of the receptor, force field: AMBER [S. J. Weiner, P. A. Kollman, 
D.A. Case, U.C. Singh, C. Ghio, G. Alagona, S. Profeta, &, P. Weiner (1 984) A new force field for molecular mechanical 
simulation of nucleic acids and proteins J. Am. Chem. Soc. 106, 765-784]) is effected using APRICOT [Yoneda S. & 
Umeyama H. (1992) Free energy perturbation calculations on multiple mutation base J. Chem. Phys. 97, 6730-6736] 
55 with the variation of the initial receptor backbone constrained. 
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U position = K position * R 2 ( 6 ) 

5 [0119] Here, the "K position" is, for example, 300.0 and R is difference from a initial coordinate. Next, dihedral angle 
constraint energy "U dihedral angle" shown by Formula 7 below is introduced to APRICOT, and MD of the minimized 
receptor is calculated (condition: temperature 300K, rectangular water bath capable of placing two at least spheres of 
water molecule in all directions outside the surface of the receptor, force field: AMBER). 

10 

U dihedral angle = K dihedral angle * (0 - 9 equivalent^ (7) 

[0120] 6 is dihedral angle (unit: rad). For "K dihedral angle", a maximum value and a minimum value are designated 
15 so that uneven constraint that corresponds to the fluctuation of the main chain is applied to each dihedral angle within 
the range between the above values. Hereinafter, MD that is conducted under constraint of dihedral angle is called MD 
with dihedral angle constrained. 

[0121] Next, dynamic structure of receptor is clustered for obtaining protein structure coordinate by MD calculation 
with dihedral angle constrained. For a previously designated active site, a population made up of active sites in structures 

20 obtained by superposing receptors of every 1 00 femtoseconds in the course of MD and an active site in an initial structure 
is established. Since dynamic information of side chain is easily lost by clustering, at first, side chain dihedral angles 
wherein dihedral angle % of s i de chain is conserved within an average angle ± 20.0 degrees in oc% of the population 
are collected. However, when it is determined that % closer to the main chain is not conserved, the subsequent % is 
considered as not being conserved. 

25 [0122] Next, structures that conserve every dihedral angle of side chain in collected those are extracted from the 
population. Then, to compare similarity of the extracted structures, when rms (root mean square) of all atoms is (3 
angstroms or less, it is determined as the same structure, and one is deleted, and based on the finally selected structures, 
a receptor dynamic structure cluster is created. As to an atom constituting unconserved dihedral angle %, it is allowed 
to collide with ligand in that calculation binding to active site because it is likely to vary, a and p are constants. 

30 [0123] Returning again to the main process of FIG. 1 15, the ligand screening apparatus 100 designates a group of 
spatial points for locating a ligand to a ligand binding site of the target protein by a process of the spatial point designating 
unit 102b (Step S40). Concretely, among the plural protein 3D structure coordinates created in Step S30, one is randomly 
selected. A spatial point in a protein coordinate is designated, for example, by the methods (1) to (4) below. 

35 (1 ) Designation of spatial point by generation of dummy atom 

[0124] Focusing on the hydrogen bond in interaction between ligand and protein, a hydrogen bond site in the protein 
is designated as a spatial point. The important issues in a hydrogen bond are distance and angle. That is, a hydrogen 
in hydrogen bond donor (hereinafter referred to as "donor") is required for calculating an angle. 
40 [0125] In the present embodiment, when there is no hydrogen atom in an active site and the ligand, a dummy hydrogen 
atom is generated in accordance with the following rule. 

1) A dummy atom is generated in an equilateral triangle centered at sp 2 orbital atom (FIG. 2). More specifically, as 
shown in FIG. 2, a dummy hydrogen atom (B) is generated at a free position in the equilateral triangle centered at 

45 the nitrogen atom (A) of sp 2 orbital type. 

2) As to a sp 3 orbital atom, when it is at a distance to form a hydrogen bond, it is deemed to rotate so as to share 
the hydrogen atom, so that only the distance is considered in calculation of hydrogen bond interaction. Therefore, 
no dummy atom is generated for the sp 3 orbital atom. 

so [0126] As to metal and water, since they can mediate binding between active site and ligand binding, a dummy atom 
is generated at an interacting position in the manner as described below. 

1) To metal such as iron, dummy atoms are generated in a regular octahedron (FIG. 3). That is, as shown in FIG. 
3, dummy atoms (B) are generated at free positions in the regular octahedron centered at zinc (A). 
55 2) To water, dummy atoms are generated in a regular tetrahedral. It is not necessary to generate a dummy atom in 

the direction in which it interacts with an active site. 
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(2) Designation of spatial point using structure-activity relationship information. 

[0127] Also in the present invention, a spatial point is designated by inputting information of the following items 
(A) to (D) in focus of structure -activity relationship (SAR) information of ligand. 

(A) Atom of active site obtained from SAR information (hereinafter referred to as "A atom"). It follows PDB format. 

(B) Atom type of ligand which is expected to interact with "A atom" (hereinafter referred as "Btype"). It follows MOL2 
format of SYBYL. 

(C) Intensity of interaction between "A atom" and "B type" (hereinafter referred to as "C intensity"). 

(D) Distance of interaction between "A atom" and "B type" (hereinafter referred to as "D distance") (unit: angstrom). 

[0128] In the present embodiment, a spatial point may be created according to the rules shown 1 ) to 4) below based 
on the input information (A) to (D) and using an initial coordinate of ligand in an active site of protein. 

1) When "A atom" is donor or metal and water (when SAR information on the active site side is designated as 
hydrogen bond donor or metal atom), a position and a surrounding at "D distance" from "A atom" with respect to 
the direction of dummy atoms generated in "(1) Designation of spatial point by generation of dummy atom" are 
selected as initial coordinates (FIG. 4 and FIG. 5). 

2) When "A atom" is a sp 3 orbital atom (when designation of SAR information on the active site side is a sp 3 orbital 
atom), a surrounding at "D distance" from "A atom" is selected as initial coordinate (FIG. 6). 

3) When "A atom" is a hydrogen bond acceptor (hereinafter referred to as "acceptor") (when designation of SAR 
information on the active site side is a hydrogen bond acceptor), a position and a surrounding at "D distance" on a 
bonding extension line of "A atom" are selected as initial coordinates (FIG. 7). 

4) In other cases (when designation of SAR information on the active site side is an atom other than the above), a 
position on the surface of a sphere with radius of "D distance" centered at "A atom" is selected as an initial coordinate 
(FIG. 8). 

[0129] Here in contrast with the above 1 ) to 4), an initial coordinate of ligand may be designated directly. 
[0130] Returning again to the main process of FIG. 115, the ligand screening apparatus 100 superposes each atom 
in the ligand for one ligand specified in Step SO to the group of spatial points determined in Step S30 by a process of 
the interaction function calculating unit 102c (Step S50). Concretely, an initial coordinate and a ligand are superposed 
by the following procedures (1) to (4) which are alignment creating algorithm using distance matrix (DALI) [Holm, L, & 
Sander, C. (1993) Protein Structure Comparison by Alignment of Distance Matrices J. Mol. Biol. 233, 123-138] modified 
for low molecules. 

(1) It is often the case that atom types of ligand correspond to those of "Btype" multiply. In view of this, a pair wherein 
an atom type of "B type" and that of ligand can be regarded as being identical is created by using random numbers. 
In such a pair, atom types of ligand should not overlap. 

(2) Since "B type" includes a plurality of initial coordinates by Step S40, selection of initial coordinate is also selected 
by using random numbers. 

(3) Create a distance matrix of the selected initial coordinate and ligand, and calculate Sscore(i, j) which is an 
interaction function as follows. 



[0131] Here, djj s is an distance between i-th spatial point and j-th spatial point in the target protein. djj s is an interatomic 
distance between i-th atom and j-th atom in a compound, a is a coefficient for making Sscore(iJ) the maximum value 
when the group of spatial points in the target protein and the compound completely overlap with each other. (3 is a 
coefficient for giving a threshold value by which it can be defined as "overlapping". 



when i is not equal to j 




when i is equal to j 
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[0132] Preferably, a is 1 .5 and p is 0.8. 

(4) Repeat (1) to (3) plural times (for example, 10000 times) to find a pair whose Sscore(i, j) is maximum, and 
superpose a ligand to an initial coordinate based on that pair information. 

5 

[0133] Next, the ligand screening apparatus 100 acquires a parameter representing dynamic behavior of protein 
according to the calculation result determined in Step S20 and Step 30 for superposition in Step S50 by a process of 
the interaction function calculating unit 102c, and calculates interaction energy between ligand and protein using the 
parameter while finely adjusting conformation of the ligand (Step S60). In other words, with respect to the ligand that is 
10 superposed in Step S50, interaction energy with the protein is calculated while optimizing the conformation by fine 
adjustment. Fine adjustment of conformation of the ligand may be conducted by translating, rotating or changing coor- 
dinates in such a range that the angle around a single bond will not exceed 0.3 by RSMD, about the ligand coordinate 
calculated in Step S50. 

[0134] Here, the fine adjustment of conformation of ligand coordinate data is preferably optimized by random search, 
is for example. In random search, infinitesimal changes between active site of protein and ligand are repeated, for example, 
8000 times in accordance with the items 1) to 3) below, to make an optimum energy "U optimum" be minimum. 

1) Up to five bonds are selected at random from rotatable bonds, and each bond is randomly rotated within the 
range of ± 1 0.0 degrees for changing the conformation of ligand. This process is effected, for example every three 

20 times. 

2) In each direction of x, y, z axes, ligand is randomly translated within the range of ±1.0 angstrom. This process 
is effected, for example every twice. 

3) In each center coordinate of rotation, a center coordinate of the rotation is randomly translated within the range 
of ±1.0 angstrom, and the ligand is randomly rotated within the range of ±5.0 degrees with respect to each of 

25 direction of three orthogonal axes. This process is effected, for example every five times. 

[0135] Here, optimum energy "U optimum" is defined by the following formula. The energy functions shown on the 
right-hand side will be individually explained below. 

30 

U optimum = U SAR + U hydrogen + U hydrophobic + 
U stacking + U collision + U internal 

35 [0136] As to van der Waals radius and interatomic interaction distance, references were made to AMBER99 [J. Wang, 
P. Cieplak & P. A. Kollam (2000) How well does a restrained electrostatic potential (RESP) model perform in calculating 
conformational energies of organic and biological molecules? J. Comput. Chem, 21, 1049-1074] and MM3 parameter 
[Ma B., Lii J.-H., Allinger N.L. (2000) Molecular polarizabilities and induced dipole moments in molecular mechanics J. 
Comput. Chem. 21, 813-825]. 

40 

(a) Energy function "U SAR " concerning SAR information 

As an index according to SAR information, energy U SAR is defined as follows. 

N 

U SA R = 5>(0 

9(i)=K SAR (i)*(RsAR(0-R) 2 -5 

50 

N is number of SAR information, R is distance from "A atom" to an interacting atom on the ligand side, K SAR (i) is 
i-th "C intensity", R SAR (i) is i-th "D distance". 6" is, for example, 20.0. 

(b) Energy function "U hydrogen" concerning hydrogen bond 

Assuming only one hydrogen bond is formed with respect to one donor (acceptor) of ligand, an acceptor (donor) on 
55 the active site side located at shortest distance is chosen, a binding angle 0 via a hydrogen (see FIG. 9, the acutest 

hydrogen bond angle is defined as 0 when pluralhydrogen atoms are attached to a donor atom) is calculated, and 
in either of the two conditions described below, is calculated. In FIG. 9, A is a donor, B is hydrogen, C is acceptor 
and e is angle of hydrogen bond. 
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N 

U hydrog en=E<PO) 
i=l 

(1) When the donor atom is sp 3 orbital type, or angle of hydrogen bond e is within ±30.0 degrees, 



Else, (p(i)= - 



^hydrogen (0 

(RhydrogenW-R + lO) 



(2) When angle of hydrogen bond e is equal to or more than ±30.0 degrees, 

tct% t> f\ ^ hydrogen (0 

IfR >R hydrogen, *(') = " (r _ J 3 



Else, (p(i)=- 



-^hydrogen (0 

(Rh ydI o se n(i)-R + io)*e 



N is number of sum of donors and acceptors of ligand, R is distance between two atoms forming a hydrogen bond, 
"K hydrogen (i)" and "R hydrogen (i)" are intensity and distance of interaction of hydrogen bond determined for each 
atom type. 

(c) Hydrophobic interaction energy function "U hydrophobic" Atoms that are capable of hydrophobic interaction in 
active site (side chains of ALA, CYS, PHE, ILE, LEU, MET, PRO, VAL, TRP and TYR, except of hydroxyl group of 
TYR) and in ligand (carbon atom) are serially numbered, and when interatomic distance R between i-th atom in the 
active site and j-th atom in the ligand is within a cutoff, (|>(i,j) is calculated. 



M N 

U hydrophobic = 2 X °P (*> j) 
i=l j=l 



IfR > R hydrophoblc (i, j), cp (i J) = 



K hydrophobic Qj) 
( R ~ R hydrophobic 0 J) + 1 °) 



Else, cp (i J) = -K hydrophobic (i, j) 



hydrophobit 

M is number of atoms in active site that are capable of hydrophobic interaction, N is number of atoms in ligand site 
that are capable of hydrophobic interaction. "K hydrophobic (i,j)" and "R hydrophobic (i,j)" are intensity and distance 
of hydrophobic interaction determined for each atom type. The cutoff is, for example, 8.0 angstroms. 
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(d) Stacking energy function "U stacking" 

Atoms forming an aromatic ring in active site and ligand are serially numbered, and a center coordinate of aromatic 
ring is calculated for active site. When interatomic distance R between i-th atom in active site and j-th atom in ligand 
is within a cutoff, taking a center coordinate of aromatic ring formed by the i-th atom as i', and taking an atom in 
5 ligand which is closest from j-th atom and belonging to the same aromatic ring as j\ Zii'j=6jj, Zi'ij=6jj, Zii'j'=e r j>, 

znj'=e|j, are calculated (FIG. 10), and when e j3 j and are 90.0 degrees ±10.0 degrees, "R boundary" and "6 
boundary" are determined, and in either of the two conditions described below, (j>'(i,j) is calculated. In FIG. 10, i' 
represents center of aromatic ring in active site, i represents aromatic ring atom in active site, j and j' represent 
aromatic ring atoms in ligand. 

10 

M N 

u stackmg = XZ ( p( i 'j) 
i=i j=i 

15 

If R boundary < 0.0, 

cp(i, j) = -K stacking (i, j)*R boundary 

Else 

cp(ij) = -K stacking (i,j)*6 boundary 
20 r boundary =1 .0 - (R stacking(ij)- R) 2 

6 boundary = 11.0 - 01 



180.0 v 7 

M is number of atoms forming an aromatic ring in active site, N is number of atoms forming aromatic ring in ligand, 
and "K stacking (i,j)" and "R stacking (i j)" are distance and intensity of stacking determined for each atom type, n 
30 is the ratio of the circumference of a circle to its diameter, and 0 is an angle at which 0 is minimum in 0 r j> and 6^. 

The cutoff is, for example, 5.0 angstroms. 

(e) Intermolecular collision energy function (elastic collision energy function) "U collision" 

As "elastic energy", the following function "U collision" may be applied while taking local flexibility of protein into 
account. When interatomic distance R between an i-th atom of a main chain or a (conserved) side chain atom with 
35 a little dynamic behavior in an active site, and j-th atom in the ligand is not more than collision distance "Rcollision 

(i j)", calculation of <>(i j) is defined as follows. 



M N 

u collision =2X>0j) 

i=l j=l 

(p(ij)= K collision * (Rcollision (i, j)-R) 2 

45 

M is number of atoms in an active site that prohibit collision, N is number of atoms of ligand. "K collision" is preferably 
1 000.0. "Rcollision(i,j)" is a sum of van der Waals radii of i-th atom in the active site and j-th atom in the ligand. 
Here, with respect to each atom in the active site, when weighting w(i) that allows collision is defined, the function 
"U collision" shown by the following Formula is used. w(i) is real number ranging from 0 to 1 . 

50 

M N 

UeoiHsion =XZq>(U) 
i=l j=l 

55 (p(i,j)= w(i)*K collision * (R collision (ij)- R) 2 

M is number of atoms in an active site. N is number of atoms in ligand. "K collision" is preferably 1 000.0. "Rcollision 
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(i j)" is a sum of van der Waals radii of i-th atom in active site and j-th atom in ligand. 
(f) Ligand internal energy "U internal" 

[0137] Since there is a case that a bond is broken due to errors that occur when a rotatable bond is changed little by 
5 little, calculations for "<|> bond length (i)" and > collision (i,j)" are defined so as to prevent occurrence of atomic collision 
within a ligand. 



L M N 

10 



Sterna! ~ X *P bond length 0 j) + X) 2 ^collision j) 
i=l i«l j=l 



cpbond length (i) = K bond length * {l 000.0 * (R bond length (i)- R x )} 2 
cp collision (i 5 j) = K collision * (R collision - R 2 ) 2 

[0138] L is number of rotatable bonds, M is number of atoms of ligand, and N is number of non-binding atoms of i-th 
atom. Preferably, "K bond length" is 100.0. "R bond length (i)" is bond length of initial structure. Preferably, "K collision" 
is 150.0 and "R collision" is 2.2 angstroms. and R 2 are distances between two atoms. 

[0139] Returning again to the description of the main process of FIG. 1 15, by a process of the interaction function 
calculating unit 102c, the ligand screening apparatus 100 largely changes conformation of ligand coordinate data with 
respect to the ligand determined in Step S50, restarts the flow from Step S50, and repeats the process from Step S60 
to Step S70 (optimization). Change of conformation may be conducted by translating, rotating or changing coordinates 
in such a range that the angle around a single bond will be equal to or more than 0.3 by RSMD, about the ligand coordinate 
calculated in Step S50. 

[0140] Here, optimization by largely changing conformation of ligand is achieved by selecting five rotatable bonds at 
random, for example, with respect to the conformation in the "U optimized" which is energy optimized in Step 50, and 
rotating them randomly in accordance with a rotation angle interval determined for each atom type. Then the process 
subsequent to Step S50 and Step S60 is repeated, for example, 5000 times. 

[0141] After changing conformation of ligand, internal energy of the ligand "U internal" is calculated, and if the value 
is 500.0 or more, a subsequent calculation may be skipped, and the next ligand conformation may be caused to generate. 
[0142] Next, the ligand screening apparatus 1 00 determines interaction energy between target protein and ligand that 
is obtained up to Step S70 by a process of interaction function calculating unit 102c (Step S80). Concretely, a complex 
coordinate between protein and ligand which provide optimum value in "U optimum" from Step S40 to Step S70, as well 
as optimum energy "U optimum" are calculated. 

[0143] Next, the ligand screening apparatus 100 returns to Step S40 by a process of the control unit 102, selects 
another ligand in Step SO, and conducts the calculations up to Step S80 by processes of the respective processing units 
(Step S90). Here, the processes from Steps S40 to Step S90 are conducted for all ligands in the compound database 
in Step SO. 

[0144] Next, the ligand screening apparatus 100 compares the interaction energies determined in Step S90 for the 
ligands in Step SO by a process of ligand evaluating unit 1 02d, and selects a ligand that is expected to bind the target 
protein (Step S100). Concretely, based on a complex coordinate between protein and ligand and optimum energy "U 
optimum" evaluated up to Step S90, a compound (ligand) which has a possibility to bind to the protein is selected from 
the database in Step SO. 

[0145] Now we end description of the main process of the system. 

[0146] As described above, according to the ligand screening apparatus 100, it is possible to evaluate and select a 
substance (ligand) that binds to a target protein by means of interaction function. Concretely, 3D structure of the target 
protein is subjected to normal mode calculation, intensity of fluctuation of each amino acid is determined, and molecular 
dynamic calculation is conducted using the intensity of fluctuation as a constraint condition. Accordingly it is possible to 
conduct molecular dynamic calculation while preventing the 3D structure of the protein from largely deviating from the 
structure at which energy is minimized. It is also possible to clearly describe the physicochemical property concerning 
dynamic behavior of a protein. Further, it is possible to use a result of normal mode analysis or a result of secondary 
structure determination of protein as a function expressing dynamic property of protein. Also when there exist plural 3D 
structural coordinates as is the case for analytical result of NMR spectrum, it is possible to evaluate all of the plural 
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coordinates equally, in short time and full-automatically for screening for a ligand that binds to the target protein. 
[0147] Further, according to the ligand screening apparatus 100, when arbitrary 3D structure of a protein comprising 
an any not only single but also plural chain(s) is given, a parameter reflecting induced-fit from the 3D structure of the 
protein and post-structural-change 3D structure coordinate are calculated in advance by a normal mode calculation 
5 method or by a molecular dynamic calculation method; an interaction function when the protein binds to other substance 
is defined using the above parameter and the post-structural-change 3D structure coordinate; and a substance that 
binds to the protein is evaluated and selected based on the interaction function by means of computer program. 
[0148] Further, according to the ligand screening apparatus 1 00, when a ligand that binds to the protein is selected, 
the series of processes shown in (0) to (8) are executed full-automatically or manually. 

10 

(0) Select one ligand from compound database. As 3D structure of target protein, a plurality of structural-change 
coordinates considering dynamic behavior using a parameter reflecting induced-fit are prepared, and one of them 
is selected at random. 

(1) Designate a spatial point in the protein at which superposition is to be conducted. 

is (2) Select a pair of an atom in the ligand selected in (0) and a spatial point designated at random in (1) so that no 

overlapping occurs. 
(3) Calculate the following Sscore(iJ). 



20 



Sscore(ij)=Y< 
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when i is not equal to j 



when i is equal to j 

ax(l-j3) 



30 dy s is distance between ith spatial point and j-th spatial point in the protein. djj c is interatomic distance between i-th 

atom and j-th atom in the compound, oc is a coefficient for making Sscore(i,j) the maximum value when the group 
of spatial points in the target protein and the compound completely overlap with each other. |3 is a coefficient for 
giving a threshold value by which it can be defined as "overlapping". 
(4) Make adjustment so that the score in (3) is maximum. 

35 (5) Optimize interaction energy with the protein for the ligand superposed in (4) by fine adjustment of conformation. 

(6) Largely change the conformation of the ligand, and restart from (2) and repeat the process up to (5) to achieve 
optimization. 

(7) Conduct the course of (1) to (6) on the plurality of structural-change coordinates prepared in (0), and calculate 
an optimum protein-ligand complex coordinate and optimum energy "U optimum". 

40 (8) Conduct the course of (1) to (7) on every ligand in the compound database prepared in (0), and select a ligand 

that has a possibility to bind to the protein from the compound database. 

[0149] Also according to the ligand screening apparatus 1 00, when a parameter reflecting induced-fit of protein and 
a post-structural-change 3D structure coordinate are calculated by using a molecular dynamic calculation method, 3D 
45 structure of the target protein is subjected to normal mode calculation, intensity of fluctuation of each amino acid is 
determined, and molecular dynamic calculation is conducted using the intensity of fluctuation as a constraint condition. 
This realizes molecular dynamic calculation without causing large difference in 3D structure of the protein from the 
structure minimized by energy optimization "U hydrogen". 

[0150] Also according to the ligand screening apparatus 100, as a target function for evaluation of interaction between 
50 ligand and protein, afunction that expresses dynamic characteristic of protein is added as elastic energy to a conventional 
interaction energy function, and interaction energy is rapidly calculated from the 3D structure coordinate of protein, and 
physicochemical property concerning dynamic behavior of protein is clearly described. 

[0151] Also according to the ligand screening apparatus 100, as a function expressing dynamic characteristics of 
protein, a result of normal mode analysis or a result of secondary structure determination of protein is used. 
55 [01 52] Also according to the ligand screening apparatus 1 00, when the calculated protein has typical plu ral 3D structure 
coordinates, or plural 3D structural coordinates as is the case for analytical result of NMR spectrum, it is possible to 
evaluate all of the plural coordinates equally, in short time and full-automatically for screening for a ligand that binds to 
the target protein. 
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[First example] 

(Determination of parameter coefficient in MD with dihedral angle constrained and clustering) 

5 [0153] Using the ligand screening apparatus 1 00 according to the above embodiment, a fluctuation value of dihedral 
angle was calculated by a normal mode analysis. As shown in this first example, a fluctuation value of dihedral angle 
was adapted as a constraint condition in molecular dynamic calculation as "K dihedral angle" in the following formula. 

10 U dihedral angle = K dihedral angle * (0 - 8 equivalent^ 

[0154] 9 is dihedral angle (unit: rad). In practice, uneven constraint was applied to each dihedral angle such that it 
corresponds to fluctuation of dihedral angle of main chain within a range between a maximum value and a minimum 
15 value of "K dihedral angle" by designating such values. Accordingly, first example aims at determining appropriate 
maximum value and minimum value for "K dihedral angle". 

[0155] Further, after molecular dynamic calculation, post-structural -change coordinates were subjected to cluster 
analysis, and a representative structure was selected. At this time, for an active site that is designated in advance, active 
sites of structures obtainable by superposing receptors of every 100 femtoseconds during MD, and an active site of 

20 initial structure were considered as a population. Concretely, since dynamic information of side chain is easily lost by 
clustering, at first, side chain dihedral angles wherein dihedral angle % of side chain is conserved within an average 
angle ± 20.0 degrees in a% of the population are collected. However, when it is determined that % closer to the main 
chain is not conserved, the subsequent % is considered as not being conserved. Next, structures that satisfy the condition 
into every side chain dihedral angle (conserved side chain dihedral angle) in collected structure were extracted from the 

25 population. Then, to compare similarity of the extracted structures, when the rms (root mean square) of all atoms was 
p angstroms or less, it was regarded as the same structure, and one was deleted, and based on the finally selected 
structures, a receptor dynamic structure cluster was created. As to an atom forming unconsented dihedral angle %, it 
was allowed to collide in the active site and ligand binding calculation because it was easy to change, a and p are 
constants, and first example also aims at determining appropriate a and p. 

30 [0156] First example also aims at obtaining best dynamic structure of main chain in an active site being in contact 
with a ligand. For this purpose, in calculation of rms (root mean square), only four main chain atoms (N, Ca, C, O) in 
the active site were considered as objectives. 

[0157] Supposing that as a maximum and a minimum values of K dihedral angle and clustering coefficients a and p, 
those capable of reproducing the structure analyzed by NMR are appropriate, first, we conducted normal mode analysis 

35 using MODEL 1 of dihydrofolate reductase (DHFR, PDB code: 1LUD) whose structure was analyzed to determine a 
fluctuation value by NMR, and then conducted molecular dynamic calculation. After the molecular dynamic calculation, 
we conducted clustering of receptor dynamic structure. Receptor residues located within 6 angstroms from each atom 
in the ligand contained in 1 LUD (MODEL 1) were defined to form an active site. Further, for MD, results of 0 to 0.1 
nanosecond, were used. Here, MD was conducted exhaustively for constraint ranging from a minimum to a maximum 

40 value of 0 to 1000 (intervals of 100), clustering coefficient a ranging from 0% to 90% (intervals of 10%), coefficient p 
ranging from 0.1 angstrom to 0.6 angstrom (intervals of 0.1 angstrom) with reference to the NMR structure average rms, 
and coefficients were determined by comparing every result with the NMR structure in 1 LUD. 

[0158] As a reference for determining coefficient p in receptor dynamic structure clustering, NMR structure average 
value was determined. In NMR structures in PDB (the Protein Data Bank), receptor is a simple protein and in one PDB 
45 file more than ten patterns of NMR structures were found. We tried to determine a NMR structure average rms of active 
site for 1 17 kinds of substrates including a ligand. 

[0159] First, in MODEL 1, receptor residues contained within 6 angstroms radius from each atom in the ligand were 
defined as forming an active site. Then for each of structure other than MODEL 1 , rms with respect to the active site of 
MODEL 1 was obtained, and then average rms thereof was determined. When the average rms is equal to or more than 

50 1.0 angstrom, since it can be considered as an apparent dynamic structure, such a PDB file was excluded from the 
objective. As a result, objective PDB files were reduced to 71 kinds. Average rms of 71 kinds were further averaged to 
give NMR structure average rms. The NMR structure average rms obtained in this manner was 0.62. 
[0160] As to determination of appropriate maximum and minimum values for "K dihedral angle" and determination of 
coefficients a and p in clustering, each parameter value was compared with NMR structure. 

55 [0161] Since 1 LUD includes 24 kinds of MODEL, and first example uses MODEL 1 as an objective, active sites of 23 
kinds other than MODEL 1 were considered as true structures (observed experimentally). In each receptor dynamic 
structure cluster outputted as a result of calculation, rms between MODEL 1 and each true structure was calculated. 
The minimum rms among the calculated rms was set as "RMS minimum", and an average value of "RMS minimum" 
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obtained from each receptor dynamic structure cluster was set as a score. Then a parameter at which the score is 
minimum was adopted. 

[0162] In FIG. 1 1, a result of normal mode analysis in MODEL 1 of 1LUD is shown, and comparison results of score 
and parameter are shown in FIGs. 12, 13, 15-18. In FIG. 11 , intensity of fluctuation in dihedral angle $ (orange A), \j/ 
(green B) is shown. The closer to 0.0 is intensity of the fluctuation, the stronger the constraint of dihedral angle becomes 
in the molecular dynamic (MD) calculation. Secondary structures determined by STRIDE are also shown: oc-helix (red 
D) and p-sheet (blue D). The purple C shows an active site. In FIG. 12, the normal mode analysis demonstrated that 
coefficient a is preferably 70%. However, since there are cases that clustering accuracy was deteriorated at 70% when 
generalization was applied, in first example, 80% was selected for a value of coefficient a. As shown in FIG. 13 and 
FIGs. 15-18, clustering coefficients a and (3 were fixed respectively at 80.0% and 0.4 angstrom. The score decreases 
as it comes blacker. 

[0163] These results demonstrated that the values of FIG. 14 are optimum as a constraint condition for reducing the 
score. As to validation of these values, Ca atom, side chain and all atoms besides the main chain atom were examined 
and the parameter values in FIG. 1 4 are proved to be optimum. 



[Second example] 

(Difference in molecular dynamic calculation in the presence/absence of constraint parameters) 

[0164] Molecular dynamic calculation adopting the constraint parameters calculated by the aforementioned ligand 
screening apparatus 100 was conducted until 2.0 nanoseconds. Then, how the structure changed in comparison with 
the case where the constraint parameters were not adopted in dynamic behavior of main chain atom in the active site 
was examined. 



Case 1) 



[0165] Examination was made on dihydrofolate reductase (MODEL 1 of 1LUD). Examination results are shown in 
FIGs. 1 9 to 21 . As the result of normal mode calculation, values that were determined in the above first example were 
adapted. 

[0166] In FIG. 19, with respect to MODEL 1 of each of 24 kinds of model structures described in PDB file of 1 LUD, 
rms of main chain atom in active site was calculated, and the average rms was displayed by dotted lines. In the case 
where dihedral angle is constrained (A), and in the case where dihedral angle is not constrained (B), difference of main 
chain atom in active site from its initial structure was represented by rms. 

[0167] FIG. 20 shows a comparison result with NMR structure when MD is calculated without constraint of dihedral 
angle. In FIG. 20, NMR structure (1 lud) is displayed in white, MD structure (Mud) id displayed in black. 
[0168] In Table 1 , a comparison result with NMR structure when MD is calculated without constraint of dihedral angle 
is shown. 

(Table 1 ) 
ACTIVE SITE ENTIRETY 
Ca ONLY 3.8903 0.2919 

MAIN CHAIN 3.8642 0.3335 
ENTIRETY 4.447 0.1398RMS 

[0169] In FIG. 21 , a comparison result with NMR structure when MD is calculated with constraint of dihedral angle is 
shown. 

[0170] In Table 2, a comparison result with NMR structure when MD is calculated with constraint of dihedral angle is 
shown. 



(Table 2) 
ACTIVE SITE 
Ca ONLY 0.6398 
MAIN CHAIN 0.6933 
ENTIRETY 1.2379 



ENTIRETY 
0.1194 
0.1053 
0.2157RMS 
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Case 2) 

[01 71 ] Here, we verified dependency on initial structure and presence/absence of constraint while selecting a structure 
(model structure) obtained by modeling according to FAMS [ Ogata K., Umeyama H. (2000) An automatic homology 
5 modeling method consisting of database searches and simulated annealing J. Mol. Graphics Mod. 18, 258-272], and 
X-ray structure as initial structures. Receptor residues contained within 1 0 angstroms radius from each atom in the ligand 
were defined as forming an active site. 

[0172] X-ray structure (3D structure) of cellular retinoic acid binding protein type ll(CRABP-ll) (PDB code: 1 CBQ) was 
used. As a reference protein, intestinal fatty acid binding protein (PDB code:1ICM) exhibiting 32.1% homology was 
10 selected, and a model structure was constructed by alignment of FIG. 22. In FIGs. 23, 24, and 25, results of X-ray 
structure comparison with model structure are shown. 

[0173] In FIG. 23, 3D structure (X-ray structure (red A) and model structure (blue B)) of 1 CBQ are shown. In FIG. 24, 
structure of 6-(2,3,4,5,6,7-hexahydro-2,4,4-trimethyl-1-methyleneinden-2-yl)-3-methylhexa-2,4-dienoic acid which is a 
substance shown in green in FIG. 23 is shown. In FIG. 25, difference between X-ray structure and model structure of 

is 1 CBQ is shown by rms. 

[0174] FIG. 26 is a view showing a result of normal mode analysis of X-ray structure of 1 CBQ, and FIG. 27 is a view 
showing a result of normal mode analysis of model structure of 1CBQ. In FIG. 26 and FIG. 27, intensity of fluctuation in 
dihedral angle (orange A), y (green B) is shown. The closer to 0.0 is intensity of the fluctuation, the stronger the 
constraint of dihedral angle becomes in the molecular dynamic (MD) calculation. Secondary structures determined by 

20 STRIDE are also shown: a-helix (red D) and (3-sheet (blue D). 

[0175] In FIG. 28, results of molecular dynamic (MD) calculation of X-ray structure and model structure of 1 CBQ are 
shown. Rms of a main chain atom in active site between X-ray. structure and model structure was determined. As shown 
in FIG. 28, A: initial structure is X-ray structure, without constraint of dihedral angle; B: initial structure is X-ray structure, 
with constraint of dihedral angle; C: initial structure is model structure, without constraint of dihedral angle; and D: initial 

25 structure is model structure, with constraint of dihedral angle. 

Case 3) 

[0176] X-ray structure of Flavodoxin (PDB code: 1J9G) was used. As a reference protein, flavodoxin (PDB code: 1 AHN) 
30 exhibiting 29.2% homology was selected, and model structure was constructed by alignment of FIG. 29. In FIG. 29, 
alignments of 1J9G and 1AHN are shown. In FIG. 30, 3D structures (X-ray structure (red A) and model structure (blue 
B)) of 1 J9G are shown. In FIG. 31 , structure of flavin mononucleotide which is a substance viewed in green C is shown 
in FIG. 30. In FIG. 32, difference between X-ray structure and model structure of 1J9G is shown by rms. 
[0177] In FIG. 33, a result of normal mode analysis of X-ray structure of 1J9G, and in FIG. 34, a result of normal mode 
35 analysis of model structure of 1J9G are shown. In FIG. 33 and FIG. 34, intensity of fluctuation in dihedral angle <j> (orange 
A), \|/ (green B) is shown. The closer to 0.0 is intensity the fluctuation, the stronger the constraint of dihedral angle 
becomes in the molecular dynamic (MD) calculation. Secondary structures determined by STRIDE[8] are also shown: 
a-helix (red D) and p-sheet (blue D). The purple C shows an active site. 

[0178] In FIG. 35, results of molecular dynamic (MD) calculation of X-ray structure and model structure of 1J9G are 
40 shown. Rms of a main chain atom in active site between X-ray structure and model structure was determined. In FIG. 
35, A: initial structure is X-ray structure, without constraint of dihedral angle; B: initial structure is X-ray structure, with 
constraint of dihedral angle; C: initial structure is model structure, without constraint of dihedral angle; and D: initial 
structure is model structure, with constraint of dihedral angle. 

45 Case 4) 

[0179] X-ray structure of Matrix metalloproteinase-8(MMP-8) (PDB code: 1MMB) was used. As a reference protein, 
MM P-3 (PDB code: 1 B3D) exhibiting 55.0% homology was selected, and a model structure was constructed by alignment 
of FIG. 36. In FIG. 36, alignments of 1 MMB and 1 B3D_A are shown. 
50 [0180] In FIG. 37, 3D structures (X-ray structure (red A) and model structure (blue B)) of 1 MMB are shown. In FIG. 
38, structure of batimastat which is a substance viewed in green C is shown. In FIG. 39, difference between X-ray 
structure and model structure of 1 MMB is shown by rms. 

[0181] In FIG. 40, a result of normal mode analysis of X-ray structure of 1 MMB, and in FIG. 41 , a result of normal 
mode analysis of model structure of 1MMB are shown. As shown in FIG. 40 and FIG. 41, intensity of fluctuation in 
55 dihedral angle <|> (orange A), y (green B) is shown. The closer to 0.0 is intensity of the fluctuation, the stronger the 
constraint of dihedral angle becomes in the molecular dynamic (MD) calculation. Secondary structures determined by 
STRIDE[8] are also shown: a-helix (red D) and (3-sheet (blue D). The purple C shows an active site. 
[0182] In FIG. 42, results of molecular dynamic (MD) calculation of X-ray structure and model structure of 1 MMB are 
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shown. Rms of a main chain atom in active site X-ray structure and model structure was determined. As shown in FIG. 
42, A: initial structure is X-ray structure, without constraint of dihedral angle; B: initial structure is X-ray structure, with 
constraint of dihedral angle; C: initial structure is model structure, without constraint of dihedral angle; and D: initial 
structure is model structure, with constraint of dihedral angle. 

5 [0183] As shown in Cases 1) to 4), the result of molecular dynamic calculation adapting constraint parameters exhibit 
less significant structural change compared to the case where constraint parameters are not adapted. This reveals that 
significant structural change can be reasonably constrained and ideal structure coordinates can be obtained by adopting 
constraint parameters in molecular dynamic method which results in large structural change due to theory of classical 
mechanics. When homology is high, the accuracy of modeled structure by FMAS is also improved. That is, since structure 

10 similar to X-ray structure is obtained, the present invention may be applied for mutation proteins substituted by several 
amino acids. 

[Third example] 

15 (Verification of protein/ligand complex model) 

[0184] By means of the ligand screening apparatus 1 00 in the above described embodiment, 3D structure of a ligand 
complex that binds to the target protein was predicted. In third example, prediction accuracy of the 3D structure coordinate 
of complex was examined. For this verification, an induced-fit type protein was used, which is known 3D structure of 
20 complex and has variable conformation of active site which varies depending on presence/absence of ligand or type of 
ligand. Residues located within 10 angstrom radius from each atom in the ligand were defined to form an active site of 
protein. Since it turned out that the structure is kept substantially equivalent in MD which uses X-ray structure or NMR 
structure as initial structure, we decided to conduct MD until 1.0 nanosecond. In the calculation, however, hydrogen 
atoms were excluded. Construction of complex model was conducted in accordance with the above embodiment. 

25 

Case 1) 

[0185] 1 BZF and 1 LUD, dihydrofolate reductase (DHFR) have 1 00% homologous but are different conformations of 
active site because separate ligands bind to protein. 1 BZF (MODEL 18) was selected as an initial structure, and using 

30 2,4-diamino-5-(3, 4, 5-trimethoxy-benzyl)-pyrimidin-1 -ium (FIG. 49), a protein/ligand complex model was created by the 
ligand screening apparatus 1 00 in the embodiment as described above, and examination was made by comparison with 
1LUD (MODEL4) which is the true structure (FIG. 43). In FIG. 43, 3D structure of dihydrofolate reductase was shown. 
In FIG. 43, receptor (green A) and ligand (red B) of 1 LUD (MODEL 4), as well as receptor (blue C) and ligand (light blue 
D) of 1 BZF (MODEL 1 8) were shown. 

35 [0186] In FIG. 44, analysis results of normal mode calculation of 1BZF was shown. In FIG. 44, intensity of fluctuation 
in dihedral angle (orange A), \\r (green B) is shown. The closer to 0.0 is intensity of the fluctuation, the stronger the 
constraint of dihedral angle becomes in the molecular dynamic (MD) calculation. Secondary structures determined by 
STRIDE are also shown: a-helix (red D) and (3-sheet (blue D). The purple C shows an active site. 
[0187] In FIG. 45 and FIG. 47, results of molecular dynamics with dihedral angle constrained of 1BZF were shown. 

40 in FIG. 45, rms with true structure, 1LUD (MODEL4) in the active site is shown. In FIG. 45, A is of main chain atom, B 
is of side chain atom, and C is of whole atom. FIG. 47 is a result of analysis of ligand binding to active site in 1BZF 
(MODEL 18). These are results of binding analyses when MD calculation was effected until 0.1 or 1.0 nanosecond, and 
dynamic structures of receptor are extracted at interval of 1 00 femtoseconds in theformer and 1 00 and 1 000 femtoseconds 
in the latter, three separate populations are created by clustering those. Evaluation was made based on rms with true 

45 structure of ligand in active site. In FIG. 46, parameter values for designating spatial points in ligand docking are shown. 
FIG. 46 shows information of structure -activity relationship obtained from 1 LUD (MODEL4). 

[0188] In FIGs. 48 to 50, comparisons between protein/ligand complex and true structure are shown. FIG. 48 shows 
binding between active site and ligand using a population created by clustering dynamic structure of receptor after those 
were extracted at interval of 100 femtoseconds within 0-1.0 nanosecond in the MD. Green A shows 1LUD (MODEL4) 

50 in true structure, blue B shows 1BZF (MODEL 18) in initial structure, and red C shows optimum structure in ligand 
binding, "by element" color D shows ligand in true structure; and light blue E shows ligand by calculation result. Rms of 
ligand was 0.9614. By causing the ligand shown in FIG. 50 to bind, induction of 0.2791 by rms occurred in the main 
chain atom in the active site. In FIG. 49, true structure (model 4 of 1 LUD) is shown in black, initial structure (model 18 
of 1BZF) is shown in gray, and optimum structure is shown in white. FIG. 50 shows 2, 4-diamino-5-(3,4,5-trimethoxy- 

55 benzyl)-pyrimidin-1-ium, which is a ligand of 1 LUD. 
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Case 2) 

[0189] 1YER and 1 YET, heat shock protein 90 (HSP90) have 100% homologous but are different conformations of 
active site because separate ligands bind to protein. Selecting 1 YER which does not bind to ligand as initial structure, 
5 and using geldanamycin as a ligand, examination was made by comparison with 1 YET which is true structure (FIG. 51). 
In FIG. 51, 3D structure of heat shock protein 90 is shown. Receptor (green A) and ligand (red B) of 1 YET and receptor 
(blue C) of 1 YER are shown. 

[0190] In FIG. 52, result of normal mode analysis of 1 YER is shown. In FIG. 52, intensity of fluctuation in dihedral 
angle (J> (orange A), \\r (green B) is shown. The closer to 0.0 is intensity of the fluctuation, the stronger the constraint of 
10 dihedral angle becomes in the molecular dynamic (MD) calculation. Secondary structures determined by STRIDE are 
also shown: a-helix (red D) and p-sheet (blue D). The purple C shows an active site. 

[0191] In FIG. 53 and FIG. 55, results of molecular dynamics with dihedral angle constrained using 1 YER are shown. 
Rms with true structure, 1 YET in the active site is shown. A is of main chain atom, B is of side chain atom, and C is of 
whole atom. FIG. 55 is a result of analysis of ligand binding to active site in 1 YER. These are results of binding analyses 

is when MD calculation was effected until 0.1 or 1 .0 nanosecond, and dynamic structure of receptor are extracted at interval 
of 100 femtoseconds in the former and 100 and 1000 femtoseconds in the latter, and the sepalate populations are 
created by clustering those. Evaluation was made based on rms with true structure of ligand in active site. In FIG. 54, 
parameter values for designating spatial points in ligand docking are shown. FIG. 54 shows information of structure- 
activity relationship obtained from 1YET. 

20 [0192] In FIGs. 56 and 57, comparison between protein/ligand complex and true structure is shown. FIGs. 56 and 57 
show binding of ligand in 1YER. FIG. 56 shows binding analysis between active site and ligand using a population 
created by clustering dynamic structure of receptor after those were extracted at interval of 100 femtoseconds within 
0-1.0 nanosecond in the MD. Green A shows 1YET in true structure, blue B shows 1YER in initial structure, and red C 
shows optimum structure in ligand binding, "by element" color D shows ligand in true structure; and light blue E shows 

25 ligand by calculation result. Rms of ligand was 1.2081. By causing the ligand shown in FIG. 57 to bind, induction of 
0.1 61 9 by rms occurred in the main chain atom in the active site. FIG. 57 shows geldanamycin which is a ligand of 1 YET. 

Case 3) 

30 [0193] 1A9U and 10UK, mitogen-activated protein kinase (MAP kinase), have 100% homologous but are different 
conformations of active site because separate ligands bind to protein. Selecting 1A9U as initial structure, and using a 
ligand contained in 1 0UK as a ligand, examination to compare with true structure of ligand in 1 0UK was made (FIG. 58). 
In FIG. 58, 3D structure of mitogen-activated protein kinase is shown. Receptor (green A) and ligand (red B) of 10UK 
and receptor (blue C) and ligand (light blue D) of 1 A9U are shown. 

35 [0194] In FIG. 59, normal mode calculation analysis of 1A9U is shown. In FIG. 59, intensity of fluctuation in dihedral 
angle § (orange A), y (green B) is shown. The closer to 0.0 is intensity of the fluctuation, the stronger the constraint of 
dihedral angle becomes in the molecular dynamic (MD) calculation. Secondary structures determined by STRIDE are 
also shown: a-helix (red D) and p-sheet (blue D). The purple C shows an active site. 

[0195] In FIG. 60 and FIG. 62, results of molecular dynamics with dihedral angle constrained using 1 YER are shown. 

40 FIG. 60 shows a result of MD calculation of 1 A9U. Rms with true structure, 1 0UK in the active site is shown in FIG. 60. 
A is of main chain atom, B is of side chain atom, and C is of whole atom. FIG. 62 shows a result of analysis of ligand 
binding to active site in 1A9U. These are results of binding analyses when MD calculation was effected until 0.1 or 1.0 
nanosecond, and dynamic structures of receptor were extracted at interval of 100 picoseconds in the former and 100 
and 1000 picoseconds in the latter, and three separate populations were created by clustering those. Evaluation was 

45 made based on rms with true structure of ligand in active site. 

[0196] In FIG. 61, parameter values for designating spatial points in ligand docking are shown. FIG. 61 shows infor- 
mation of structure -activity relationship obtained from 10UK. 

[0197] FIGs. 63 to 65 show comparison between protein/ligand complex and true structure. FIGs. 63 to 65 show 
binding of ligand in 1 AU9. FIG. 63 shows binding analysis between active site and ligand using a population created by 

50 clustering dynamic structures of receptor after those were extracted at interval of 100 femtoseconds within 0-1 .0 nano- 
second in the MD. Green A shows 10UK in true structure, blue B shows 1A9U in initial structure, and red C shows 
optimum structure in ligand binding, "by element" color D shows ligand in true structure; and light blue E shows ligand 
by calculation result. Rms of ligand was 1 .61 1 2. By causing the ligand shown in FIG. 65 to bind, induction of 0.1 871 by 
rms occurred in the main chain atom in the active site. In FIG. 64, true structure (1 OUK) is shown in black, initial structure 

55 (1 A9U) is shown in gray, and optimum structure is shown in white. FIG. 65 shows 4-[5-[2-(1-phenyl-ethylamino)-pyrimidin- 
4-yl]-1-methyl-4-(3-trifluoromethylphenyl)-1 H-imidazol-2-yl]-piperidine which is a ligand of 10UK. 
[0198] As shown in Cases 1 ) to 3), it was proved that a model of protein/ligand complex created by the ligand screening 
apparatus 1 00 can predict 3D structure of induced-fit type protein/ligand complex with high accuracy. 
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[Fourth example] 

(Application example to in silico screening using Fxa) 

5 [0199] By the ligand screening apparatus 100 in the above embodiment, a ligand having a possibility to bind to Fxa 
was screened from the compound database using 3D structure of Fxa (FIG. 66) which is one kind of serine protease. 
As 3D structure, 1 AIX was used, and as a ligand database, 3633 kinds of ligands extracted from the PDB database was 
used. In accordance with the above embodiment, in silico screening was conducted. The results are shown in FIG. 67. 
[0200] In FIG. 67, ligands to 100th from the top of interaction energy to 1AIX among the ligands in the compound 

10 database are shown. In FIG. 67, the bold character is a ligand contained in 1 AIX, the italic character is serine protease. 
"PDB code" means code of PDB in which the ligand is originally contained. In FIG. 67, a ligand originally contained in 
1AIX ranks 19th. 

[0201] A protein/ligand complex structure in 19th ranking is shown in FIG. 68 and FIG. 69. In FIG. 68, receptor is 
shown in white, and ligand of 1 AIX is shown in black. FIG. 69 shows a ligand in 1AIX. 

15 [0202] Every ligand in 35th, 38th and 80th rankings in FIG. 67 bind to serine protease. 

[0203] These structures and protein/ligand complex structures are shown in FIG. 70 and FIG. 71 , FIG. 72, FIG. 73, 
FIG. 74, and FIG. 75. In FIG. 70 and FIG. 71 , structure of a protein/ligand complex in 35th ranking is shown. In FIG. 70, 
receptor is shown in white, and ligand of 1 AUJ is shown in black. FIG. 71 shows a ligand in 1 AUJ. FIG. 72 and FIG. 73, 
structures of a protein /ligand complex in 38th ranking is shown. In FIG. 72, receptor is shown in white, and true ligand 

20 (1 FOR) is shown in black. The rms was 1 .500. FIG. 73 shows a ligand in 1 FOR. In FIG. 74 and FIG. 75, structures of 
a protein/ligand complex in 80th ranking are shown. As shown in FIG. 74, receptor is shown in white, and ligand of 1 K1 M 
is shown in black. Fig. 75 shows a ligand in 1 K1 M. 

[0204] These results revealed that feasible compounds can be selected from the compound database according to 
the present invention. 

25 

[Fifth example] 

(in silico screening under different conditions) 

30 [0205] We verified that the ranking varies depending on the information of structure-activity relationship (SAR) by the 
ligand screening apparatus 100 in the above embodiment. We verified variation in ranking when the receptor is fixed. 
[0206] Here, we executed in silico screening using protease of severe acute respiratory syndrome (SARS). As an 
initial structure, 1 UK3 (B chain) not containing a ligand, and a binding mode of 1 UK4 (B chain) including ligand was used 
as information of structure -activity relationship. The active site is a receptor residue region contained within 1 0 angstrom 

35 radius from each atom in the ligand of 1 UK4 (B chain). As a ligand database, 3633 kinds of ligands collected from PDB 
were used. As a receptor dynamic structure cluster for use in binding analysis, the population assembling the structure 
extracted at interval of 1 00 femtoseconds within the range of 0-0.1 nanosecond was used. The calculation was conducted 
with the exclusion of hydrogen atoms. 

[0207] FIG. 76 shows 3D structure of SARS protease. Receptor (green A) and ligand (red B) of IUK4 (B chain), as 
40 well as receptor (blue C) of 1 UK3 (B chain) are shown. 

[0208] In FIG. 77, a result of normal mode analysis of 1 UK3 (B chain) is shown. Also intensity of fluctuation in dihedral 

angle $ (orange A), \\r (green B) is shown. The closer to 0.0 is intensity of the fluctuation, the stronger the constraint of 

dihedral angle becomes in the molecular dynamic (MD) calculation. Secondary structures determined by STRIDE are 

also shown: oc-helix (red D) and p-sheet (blue D). The purple C shows an active site. 
45 [0209] In FIG. 78, a result of molecular dynamic calculation of 1 UK3 is shown. Fig. 78 shows rms between MD result 

of 1 UK3 (B chain) and 1 UK4 (B chain) in the active site. A is of main chain atom, B is of side chain atom, and C is of 

whole atom. 

Case 1) Designating four points in SAR 

50 

[0210] FIG. 79 shows spatial point designating within active sites of 1 UK4. FIG. 79 is of structure-activity relationship 
information obtained from 1 UK4(B chain). In FIG. 80, a result of in silico screening in 1 UK3 (B chain) is shown. 
[0211] In FIG. 81, 1UK3 are compared with 1UK4 which is a true structure. The ranking is 25th. Green A represents 
1UK4 (B chain), bleu B represents 1UK3 (B chain) in initial structure, red C represents optimum structure in ligand 
55 binding, "by element" cooler D represents a true structure of peptide ligand (ASN-SER-THR-LEU-GLN) of 1 UK4, and 
light blue E represents a calculated ligand. Rms of ligand was 2.5721. Rms with the true structure for main chain in 
active site was 1 .0248 in initial structure, and 1 .0792 in optimum structures. 

[0212] In FIG. 82 and FIG. 83, the first ranking of in silico screening is shown. In FIG. 83, (C8-R) hydantocidin 5'- 
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phosphate which is a ligand of 1QF4 is shown. 

Case 2) Designating three spatial points in SAR 

5 [0213] FIG. 84 shows spatial points designating within active sites of 1UK3. FIG. 84 shows structure-activity relationship 
information obtained from 1 UK3 (B chain). 

[0214] FIG. 85 shows a result of comparison between 1UK3 (B chain) and 1UK4 (B chain), and shows comparison 
of the optimum structure in 1 UK3 with true structure. The ranking is 49th. Green A represents 1 UK4 (B chain), bleu B 
represents 1 UK3 (B chain) in initial structure, red C represents optimum structure in ligand binding, "by element" cooler 
10 D represents a true structure of peptide ligand (ASN-SER-THR-LEU-GLN) of 1UK4, and light blue E represents a 
calculated ligand. Rms of ligand was 2.0057. Rms with the true structure for main chain in active site was 1.0248 in 
initial structure, and 1 .0469 in optimum structures. 

[0215] In FIG. 86, a result of in silico screening executed while designating three spatial points of SAR is shown. 

is Case 3) Designating five spatial points in SAR 

[021 6] FIG. 87 shows spatial points designating within active sites of 1 UK3. FIG. 87 is of structure -activity relationship 
information obtained from 1 UK3 (B chain). In FIG. 88 is shown a result of in silico screening (for example, high throughput 
screening) executed for five designated spatial points of SAR. 

20 [0217] FIG. 89 shows comparison between optimum structure and true structure in 1 UK3. FIG. 89 shows a result of 
comparison between 1 UK3 (B chain) and 1 UK4 (B chain). The ranking is second. Green A represents 1 UK4 (B chain), 
bleu B represents 1 UK3 (B chain) in initial structure, red C represents optimum structure in ligand binding, "by element" 
cooler D represents a true structure of peptide ligand (ASN-SER-THR-LEU-GLN) of 1 UK4, and light blue E represents 
a calculated ligand. Rms of ligand was 1 .2578. Rms with the true structure for main chain in active site was 1 .0248 in 

25 initial structure, and 1 .1 620 in optimum structures. 

Case 4) Changing designated atom type of ligand atom 

[0218] FIG. 90 shows spatial points designating within active sites of 1 UK3. FIG. 90 is of structure -activity relationship 
30 information obtained from 1 UK3 (B chain). In FIG. 91 is shown a result of in silico screening (for example, high throughput 
screening) executed while changing the designated atom type of ligand atom. 

[0219] FIG. 92 shows comparison between optimum structure and true structure in 1 UK3. FIG. 92 shows a result of 
comparison between 1UK3 (B chain) and 1UK4 (B chain). The ranking is 774th. Green A represents 1 UK4 (B chain), 
bleu B represents 1 UK3 (B chain) in initial structure, red C represents optimum structure in ligand binding, "by element" 
35 cooler D represents a true structure of peptide ligand (ASN-SER-THR-LEU-GLN) of 1 UK4, and light blue E represents 
a calculated ligand. Rms of ligand was 2.5216. Rms with the true structure for main chain in active site was 1 .0248 in 
initial structure, and 1 .0792 in optimum structures. 

Case 5) Fixed receptor 

40 

[0220] FIG. 93 shows spatial points designated within active sites. FIG. 93 is of structure-activity relationship information 
obtained from 1UK4 (B chain). In FIG. 94 shows shown a result of in silico screening (for example, high throughput 
screening) executed while fixing the receptor. 

[0221] FIG. 95 shows comparison between experimentally observed and calculated structures of ligand , when 1 UK3 
45 is superimposed on 1UK4. The ranking is 39th. Structure of active site of 1UK3 is shown in gray, the former ligand is 
shown in black, and the latter ligand is shown in white. 

[0222] As can be seen from Cases 1) to 4), the more SAR is designated, the better the ranking of the reference ligand 
is. That is, various ligands are caused to distribute in top ranking by conducting in silico screening with increased SAR 
information when binding information on reference ligand is reliable, and by reducing number of information of SAR and 
50 designating the relaxed range of atom types in ligand when the information is unreliable. More reliable result was produced 
when in silico screening was executed with the use of SRA information modified based on the distribution information. 
[0223] Case 1 ) and Case 5) demonstrate variation in ranking depending on the presence/absence of dynamic structure 
of receptor. Optimization of structures of ligand and receptor is superior in preventing collision of atoms to optimization 
of structure of only ligand. Therefore, difference arose in optimization energy for placing ligand at the same position. 

55 
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[Sixth example] 

(Distribution of MD parameter concerning parameters of molecular dynamic calculation with dihedral angle contrained) 

5 [0224] Now explanation will be given for distribution of parameter of MD with dihedral angle constrained in FMN- 
binding protein. Here, we verified whether similar results are obtained in molecular dynamic calculation with dihedral 
angle constrained and parameters of clustering even for N MR structure along with 1 LUD by means of the ligand screening 
apparatus 1 00 of the aforementioned embodiment. We selected MODEL 1 of NMR structure (PDB code:1 AXJ) of FMN- 
binding protein as an initial structure. As to the evaluation method, first example was followed except that parameters 

10 (<x=80.0%, cc=0.4 angstrom) for receptor dynamic structure clustering were fixed. 

[0225] In FIG. 96, distribution of scores for determining parameter in molecular dynamic calculation with dihedral angle 
constrained in 1 AXJ is shown. FIG. 96 shows distribution of parameter of MD with dihedral angle constrained in 1 AXJ. 
The closer to A, the smaller the score is. As is the case with 1 LUD, good results were obtained at maximum value 800 
and minimum value 0 of dihedral angle constraint. 

15 

[Seventh example] 

(MD with dihedral angle constrained) 

20 [0226] Here, we verified dynamic structure of each atom by main chain in MD with dihedral angle constrained by 
means of the ligand screening apparatus 100 in the aforementioned embodiment. There is sometimes the case that 
normal mode analysis does not converge and information of dihedral angle fluctuation is not obtained. Since good result 
is obtained when MD is conducted with a constant constraint (500) with respect to the main chain dihedral angle as 
shown in FIG. 13, weverified dynamic structure in this case. Following first example, MDwasconductedwithout constraint, 

25 with constraint using dihedral angle fluctuation or with constant constraint (500). In FIG. 97 to FIG. 108, results of dynamic 
behavior in each atom in molecular dynamic calculation executed for 1LUD are shown. 

[0227] FIG. 97 to FIG. 108 are results of MD calculation for 1 LUD (MODEL 1). FIG. 97 and FIG. 98 are results of 
dynamic behavior of main chain atom in active site, FIG. 99 and FIG. 100 are of main chain atom in receptor, FIG. 101 
and FIG. 102 are of side chain atom in active site, FIG. 103 and FIG. 104 are of side chain atom in receptor, FIG. 105 
30 and FIG. 106 are of whole atom in active site, and FIG. 1 07 and FIG. 108 are of whole atom in receptor. For each model 
structure of 24 kinds described in PDB file of 1LUD, rms with main chain atom in active site by MD of MODEL 1 was 
calculated, and an average rms was displayed by dotted lines. Difference of main atom in the active site from its initial 
structure was shown by rms in the presence (A) or absence (B) of dihedral angle constraint (A), or with constant dihedral 
angle constraint (C). 

35 [0228] Although not shown herein, with regard to 1CBQ, 1J9G, 1MMB, 1BZF (MODEL 18), 1YER, 1A9U and 1UK3 
(B chain), the results of constrained MD based on fluctuation of main chain dihedral angle demonstrated that a side 
chain atom not being constrained also exhibited a certain measurement of motion even if that of the main chain atom 
was constrained as is the case with FIGs. 109 to 111. It can be understood that motion of main chain atom heavily 
influences on the motion of the receptor. 

40 

[Eighth example] 

(Binding analysis under different conditions) 

45 [0229] Here, we verified that induction was caused even when different parameters were used in MD with dihedral 
angle constrained and clustering by means of the ligand screening apparatus 100 in the aforementioned embodiment. 
Second example was followed except that the maximum value of constraint was set at 1 00, the minimum value was set 
at 0, the clustering coefficients a and p of receptor dynamic structure were set at 80.0% and 1 .0 angstroms, respectively. 
For clustering of receptor dynamic structure, the structures were extracted at a interval of 1 00 femtoseconds within the 

50 range of 0 to 0.1 nanosecond was used, and a polulation of those was created. Receptor residues within 6 angstroms 
radius from each atom in the ligand are defined to form an active site. 

[0230] In FIG. 1 09 to FIG. 111, results of receptor/ligand binding under different conditions are shown. 

(i) In 1BZF (MODEL 18), binding of ligand caused induction of 0.2686 in rms of main chain atom in the active site 
55 (FIG. 109). In overall rms of active site, induction of 0.1224 was caused. Rms of ligand was 0.8526. 

(ii) In 1YER, binding of ligand caused induction of 0.22376 in rms of main chain atom in the active site (FIG. 1 10). 
In overall rms of active site, induction of 0.081 6 was caused. Rms of ligand was 0.7246. 

(iii) In 1 A9U, binding of ligand caused induction of 0.2150 in rms of main chain atom in the active site (FIG. 111). In 
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overall rms of active site, induction of 0.0464 was caused. Rms of ligand was 0.9464. 

[0231] Green lines show true structure, blue lines show initial structure, red lines show optimum structure, "by element" 
color lines show true ligand, and blue lines show optimum ligand. 
5 [0232] As shown in FIG. 1 09 to FIG. 111, optimum results were obtained within a given condition even if each condition 
differs. 

[Ninth example] 

10 (Binding analysis when true structure is selected as initial structure) 

[0233] Here, since binding modes of 1 BZF and 1LUD of DHFR resemble, binding analysis of ligand of 1BZF was 
conducted with partial modification of structure-activity relationship information by means of the ligand screening appa- 
ratus 100 in the aforementioned embodiment. Condition was such that 1 BZF (MODEL 18) was used as initial structure, 
is a cluster created from a population of 0 to 0.1 nanosecond was used. In FIG. 1 1 2, spatial point designated in active site 
of 1 BZF is shown. FIG. 1 12 is a view of structure-activity relationship information modified for 1 BZF. In FIG. 1 13 and 
FIG. 114, results of receptor/ligand binding in 1BZF are shown. In FIG. 113 and FIG. 114, results of ligand binding 
analysis of 1 BZF (MODEL 18) are shown. 

20 (i) As optimum structure of a receptor , initial structure was selected. Rms of ligand was 0.8884 (FIG. 113). 

(ii) Trimetrexate, which is a ligand for 1 BZF(MODEL 1 8) (FIG. 1 1 4). 

[0234] Since the initial structure was a structure that was originally registered in PDB, namely optimum structure, the 
calculation results could be reproduced as shown in FIG. 113 and FIG. 1 14. 

25 

INDUSTRIAL APPLICABILITY 

[0235] As described above, a ligand screening apparatus, a ligand screening method, and a program and recording 
medium according to the present invention seem to be very useful in the fields conducting analysis of receptor/ligand 
30 binding (drug design), especiallyformoleculardesign of pharmaceutical and agricultural chemicals. The present invention 
can be widely practiced in many industrial fields, especially in pharmaceutical, food, cosmetics, medical, structural 
analysis, functional analysis and the like fields, and hence are very useful. 



35 Claims 

1 . A ligand screening apparatus which screens for a ligand that binds to a protein when coordinate data of the protein 
of single chain or plural chains is given, the apparatus comprising: 

40 post-structural-change protein coordinate data selecting unit that effects structural change in consideration of 

dynamic behavior using induced-fit parameter reflecting induced fit on the coordinate data of protein and selects 
post-structural-change protein coordinate data; 

spatial point designating unit that designates a spatial point at which superposition with the ligand is to be 
conducted, from the post-structural-change protein coordinate data selected by the post-structural-change pro- 
45 tein coordinate data selecting unit; 

interaction function calculating unit that calculates an interaction function when the protein and the ligand bind 
to each other using the spatial point designated by the spatial point designating unit and a ligand coordinate 
data of the ligand; and 

ligand evaluating unit that evaluates the ligand that binds to the protein based on the interaction function cal- 
50 culated by the interaction function calculating unit. 

2. The ligand screening apparatus according to claim 1 , wherein the interaction function calculating unit calculates the 
interaction function using Score (i,j) shown in Formula 1. 

55 
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A 



(wherein dy s is a distance between i-th spatial point and j-th spatial point in the target protein. dy c is an interatomic 
distance between i-th atom and j-th atom in the compound, a is a coefficient for making Sscore(iJ) the maximum 
value when the group of spatial points in the target protein and the compound completely overlap with each other. 
(3 is a coefficient for giving a threshold value by which it can be defined as "overlapping") 

The ligand screening apparatus according to claim 1 or 2, wherein the interaction function calculating unit further 
comprises interaction function optimizing unit that carries out optimization so as to make the score of interaction 
function maximum. 

The ligand screening apparatus according to claim 3, wherein the interaction function calculating unit further com- 
prises: 

interaction energy optimizing unit that calculates interaction energy with the protein for the superposed ligand 
after optimization of the interaction function by the interaction function optimizing unit, and optimizes the inter- 
action energy while finely adjusting conformation of ligand 3D structure data. 

The ligand screening apparatus according to claim 4, wherein the ligand evaluating unit further comprises: 

reevaluating unit that executes the interaction function calculating unit after largely changing conformation of 
ligand 3D structure data following optimization by the interaction energy optimizing unit, and reevaluates the 
ligand that binds to the protein based on the interaction function calculated by the interaction function calculating 
unit. 

The ligand screening apparatus according to any one of claims 1 to 5, wherein in calculation of any one of the 
induced-fit parameter and the post-structural-change protein coordinate data or both, the post-structural-change 
protein coordinate data selecting unit calculates normal mode for the protein coordinate data, determines intensity 
of fluctuation of each amino acid, and conduct molecular dynamic calculation using the intensity of fluctuation as a 
constraint condition. 

The ligand screening apparatus according to claim 6, wherein the post-structural-change protein coordinate data 
selecting unit calculates a fluctuation value of dihedral angle of main chain according to normal mode calculation, 
and conducts molecular dynamic calculation while setting the fluctuation value as a coefficient of force K in the 
molecular dynamic calculation shown by Formula 2 or Formula 3. 

Erot = Krot((j)-(j)0) 2 {Formula 2] 

(wherein Erot represents energy of dihedral angle of main chain atom in 3D structure of a protein, represents 
dihedral angle of main chain atom. §0 represents standard value of dihedral angle of main chain atom. Here, when 
a value of Krot is large, <|> is constrained by §0.) 

EpOS = Kpos(r-rO) 2 [Formula 3] 

(wherein Epos represents position energy of main chain atom in 3D structure of a protein, r represents coordinate 



when i is not equal to j 
ax[exp {-( d ;-d^}-p\ 



when i is equal to j 



^ * ll— [Formula 1] 
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of main chain atom. rO represents standard value of coordinate of main chain atom. Here, when a value of Kpos 
is large, r is constrained by rO.) 

8. The ligand screening apparatus according to any one of claims 1 to 7, wherein the interaction function calculating 
unit uses the interaction function to which a dynamic property function representing dynamic property of protein is 
added as "elastic energy". 

9. The ligand screening apparatus according to claim 8, wherein the interaction function calculating unit adapts "U 
collision" as "elastic energy" which is a function shown by Formula 4 in consideration of local flexibility of protein. 

M N 

u collision =ZI>(',y) 

<p (/, j) = Kcollision * (Rcollision (/, j) - R) 2 

[Formula 4] 

(wherein M is number of atoms in an active site that prohibit collision, N is number of atoms of ligand. When 
interatomic distance R between an i-th atom of a main chain or a side chain with a little dynamic behavior in an 
active site, and j-th atom in the ligand is not more than collision distance "Rcollision (ij)", (j)(i,j) is calculated) 

10. The ligand screening apparatus according to any one of claims 1 to 7, wherein the interaction function calculating 
unit uses the interaction function to which a normal mode analysis result or secondary structure determination result 
of the protein is added as a dynamic property function that represents dynamic property of protein. 

1 1 . A ligand screening method which screens for a ligand that binds to a protein when coordinate data of the protein of 
single chain or plural chains is given, the method comprising: 

post-structural-change protein coordinate data selecting step that effects structural change in consideration of 
dynamic behavior using induced-fit parameter reflecting induced fit on the coordinate data of protein and selects 
post-structural-change protein coordinate data; 

spatial point designating step that designates a spatial point at which superposition with the ligand is to be 
conducted, from the post-structural-change protein coordinate data selected by the post-structural-change pro- 
tein coordinate data selecting step; 

interaction function calculating step that calculates an interaction function when the protein and the ligand bind 
to each other using the spatial point designated by the spatial point designating step and a ligand coordinate 
data of the ligand; and 

ligand evaluating step that evaluates the ligand that binds to the protein based on the interaction function 
calculated by the interaction function calculating step. 

12. The ligand screening method according to claim 11, wherein the interaction function calculating step calculates the 
interaction function using Score (i,j) shown in Formula 1. 



when i is not equal to j 




[Formula 1] 



when i is equal to j 



(wherein djj s is a distance between i-th spatial point and j-th spatial point in the target protein. dy c is an interatomic 
distance between i-th atom and j-th atom in the compound, a is a coefficient for making Sscore(iJ) the maximum 
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value when the group of spatial points in the target protein and the compound completely overlap with each other. 
(3 is a coefficient for giving a threshold value by which it can be defined as "overlapping") 

13. The ligand screening method according to claim 11 or 12, wherein the interaction function calculating step further 
comprises interaction function optimizing step that carries out optimization so as to make the score of interaction 
function maximum. 

14. The ligand screening method according to claim 13, wherein the interaction function calculating step further com- 
prises: 

interaction energy optimizing step that calculates interaction energy with the protein for the superposed ligand 
after optimization of the interaction function by the interaction function optimizing step, and optimizes the inter- 
action energy while finely adjusting conformation of ligand 3D structure data. 

15. The ligand screening method according to claim 14, wherein the ligand evaluating step further comprises: 

reevaluating step that executes the interaction function calculating step after largely changing conformation of 
ligand 3D structure data following optimization by the interaction energy optimizing step, and reevaluates the 
ligand that binds to the protein based on the interaction function calculated by the interaction function calculating 
step. 

16. The ligand screening method according to any one of claims 11 to 15, wherein in calculation of any one of the 
induced-fit parameter and the post-structural-change protein coordinate data or both, the post-structural-change 
protein coordinate data selecting step calculates normal mode for the protein coordinate data, determines intensity 
of fluctuation of each amino acid, and conduct molecular dynamic calculation using the intensity of fluctuation as a 
constraint condition. 

17. The ligand screening method according to claim 16, wherein the post-structural-change protein coordinate data 
selecting step calculates a fluctuation value of dihedral angle of main chain according to normal mode calculation, 
and conducts molecular dynamic calculation while setting the fluctuation value as a coefficient of force K in the 
molecular dynamic calculation shown by Formula 2 or Formula 3. 



(wherein Erot represents energy of dihedral angle of main chain atom in 3D structure of a protein. cj> represents 
dihedral angle of main chain atom. (J)0 represents standard value of dihedral angle of main chain atom. Here, when 
a value of Krot is large, (j> is constrained by §0.) 



(wherein Epos represents position energy of main chain atom in 3D structure of a protein, r represents coordinate 
of main chain atom. rO represents standard value of coordinate of main chain atom. Here, when a value of Kpos 
is large, r is constrained by rO.) 

18. The ligand screening method according to any one of claims 11 to 1 7, wherein the interaction function calculating 
step uses the interaction function to which a dynamic property function representing dynamic property of protein is 
added as "elastic energy". 

19. The ligand screening method according to claim 18, wherein the interaction function calculating step adapts "U 
collision" as "elastic energy" which is a function shown by Formula 4 in consideration of local flexibility of protein. 




[Formula 2] 




[Formula 3] 
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M N 

u co „ ision = £2> (/,./) 

(2? (/, y) = Kcollision * (Rcollision (/, j) - /?) 2 

[Formula 4] 

(wherein M is number of atoms in an active site that prohibit collision, N is number of atoms of ligand. When 
interatomic distance R between an i-th atom of a main chain or a side chain with a little dynamic behavior in an 
active site, and j-th atom in the ligand is not more than collision distance "Rcollision (i,j)", (|>(i,j) is calculated) 

20. The ligand screening method according to any one of claims 11 to 1 7, wherein the interaction function calculating 
step uses the interaction function to which a normal mode analysis result or secondary structure determination result 
of the protein is added as a dynamic property function that represents dynamic property of protein. 

21. A program which makes a computer execute a ligand screening method which screens for a ligand that binds to a 
protein when coordinate data of the protein of single chain or plural chains is given, the method comprising: 

post-structural-change protein coordinate data selecting step that effects structural change in consideration of 
dynamic behavior using induced-fit parameter reflecting induced fit on the coordinate data of protein and selects 
post-structural-change protein coordinate data; 

spatial point designating step that designates a spatial point at which superposition with the ligand is to be 
conducted, from the post-structural-change protein coordinate data selected by the post-structural-change pro- 
tein coordinate data selecting step; 

interaction function calculating step that calculates an interaction function when the protein and the ligand bind 
to each other using the spatial point designated by the spatial point designating step and a ligand coordinate 
data of the ligand; and 

ligand evaluating step that evaluates the ligand that binds to the protein based on the interaction function 
calculated by the interaction function calculating step. 

22. The program according to claim 21, wherein the interaction function calculating step calculates the interaction 
function using Score(iJ) shown in Formula 1 . 



when i is not equal to j 




[Formula 1] 



when i is equal to j 



_ax(l-/J) 

(wherein djj s is a distance between i-th spatial point and j-th spatial point in the target protein, dy 0 is an interatomic 
distance between i-th atom and j-th atom in the compound, a is a coefficient for making Sscore(iJ) the maximum 
value when the group of spatial points in the target protein and the compound completely overlap with each other. 
(3 is a coefficient for giving a threshold value by which it can be defined as "overlapping") 

23. The program according to claim 21 or 22, wherein the interaction function calculating step further comprises inter- 
action function optimizing step that carries out optimization so as to make the score of interaction function maximum. 

24. The program according to claim 23, wherein the interaction function calculating step further comprises: 

interaction energy optimizing step that calculates interaction energy with the protein for the superposed ligand 
after optimization of the interaction function by the interaction function optimizing step, and optimizes the inter- 
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action energy while finely adjusting conformation of ligand 3D structure data. 

25. The program according to claim 24, wherein the ligand evaluating step further comprises: 

reevaluating step that executes the interaction function calculating step after largely changing conformation of 
ligand 3D structure data following optimization by the interaction energy optimizing step, and reevaluates the 
ligand that binds to the protein based on the interaction function calculated by the interaction function calculating 
step. 

26. The program according to any one of claims 21 to 25, wherein in calculation of any one of the induced-fit parameter 
and the post-structural-change protein coordinate data or both, the post-structural-change protein coordinate data 
selecting step calculates normal mode for the protein coordinate data, determines intensity of fluctuation of each 
amino acid, and conduct molecular dynamic calculation using the intensity of fluctuation as a constraint condition. 

27. The program according to claim 26, wherein the post-structural-change protein coordinate data selecting step cal- 
culates a fluctuation value of dihedral angle of main chain according to normal mode calculation, and conducts 
molecular dynamic calculation while setting the fluctuation value as a coefficient of force K in the molecular dynamic 
calculation shown by Formula 2 or Formula 3. 

Erot = KTOt((j)-<l)0) 2 [Formula 2] 

(wherein Erot represents energy of dihedral angle of main chain atom in 3D structure of a protein. (j> represents 
dihedral angle of main chain atom. §0 represents standard value of dihedral angle of main chain atom. Here, when 
a value of Krot is large, § is constrained by $0.) 

Epos = Kpos (r - rO) 2 [Formula 3] 

(wherein Epos represents position energy of main chain atom in 3D structure of a protein, r represents coordinate 
of main chain atom. rO represents standard value of coordinate of main chain atom. Here, when a value of Kpos 
is large, r is constrained by rO.) 

28. The program according to any one of claims 21 to 27, wherein the interaction function calculating step uses the 
interaction function to which a dynamic property function representing dynamic property of protein is added as 
"elastic energy". 

29. The program according to claim 28, wherein the interaction function calculating step adapts "U collision" as "elastic 
energy" which is a function shown by Formula 4 in consideration of local flexibility of protein. 

M AT 

u colll5ion =ZZ^ (''■/') 

<p(i, j) = Kcollision * (Rcollision (/, j) -R) 2 

[Formula 4] 

(wherein M is number of atoms in an active site that prohibit collision, N is number of atoms of ligand. When 
interatomic distance R between an i-th atom of a main chain or a side chain with a little dynamic behavior in an 
active site, and j-th atom in the ligand is not more than collision distance "Rcollision (ij)", <|>(i,j) is calculated) 

30. The program according to any one of claims 21 to 27, wherein the interaction function calculating step uses the 
interaction function to which a normal mode analysis result or secondary structure determination result of the protein 
is added as a dynamic property function that represents dynamic property of protein. 
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31. A computer readable recording medium in which the program according to either of claims 21 to 30 is recorded. 
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1KCI 


030 


-1766.9010 


1KZK 


031 


-1728.2876 


6GSX 


032 


-1709.9359 


2PRG 


033' 


-1699.2351 


1NPW 


034 


-1694.4086 


2UPJ 


r 035 


7 -1661.4315 


1AUJ 


036 


-1658.1970 


1HFR 


037 


-1654.2430 


1DMP 


(038. 


' -1599.5870 


1F0R 


039 : 


-1595.7907 


2GSQ 


040 


-1569.9256 


1QHC 


041 ! 


-1530.3871 


1AIM 


042 


I -1481.1846 


1EL3 


043 : 


-1473.7372 


1QH5 


044 


-1453.3935 


1LHC 


045] 


-1411.1465 


1HFC 


046 


-1389.8129 


2FMB 


047 


-1372.1506 


1GFW 


048 


-1352.8868 


1EM6 


049 


-1329.5658 


1AU0 


050 : 


-1306.5704 


1M9B 


051 


-1287.3729 


1EAS 


052 


-1265.8962 


1LHE 


053 


-1248.8527 


1C8T 


054 ; 


-1244.2458 


1MMQ 


055 


-1216.6454 


1QIP 


056 : 


-1200.9810 


207D 


057 


-1175.5120 


1HWL 


058 


-1138.1881 


4UPJ 


059 


-1112.7163 


3GST 


060 


-1068.0641 


1LEE 


061 


-1030.5972 


1GA9 


062 


-1030.4960 


10D7 


063 


-1029.0345 


1HOV 


064 ; 


-1018.1686 


1LF2 


065 


-1011.9100 


10DY 


066 


-976.1041 


1CQQ 


067; 


-948.0992 


1G2K 


068! 


-936.9058 


2AIM 


069 


-934.4739 


1NWL 


070 


-924.6255 


6FIV 


071 


-902.7587 


1YEI 


[072 ; 


-900.4131 


1MXT 


073 


-894.5544 


1YEF 


074 


-874.9274 


1DZT 


075 


-857.5373 


1QF0 


076 : 


-851.1669 


1EGV 


077 


-844.2406 


1F29 


078 : 


-824.5393 


1KV2 


079 


-820.4913 


456C 


W80j 


] -775.9659 


1K1M 


081 


-766.8359 


1JR4 


082 


-763.2825 


2KCE 


083 


-739.3676 


1KIM4 


084 


-733.8593 


1RT2 


085 


-728.8765 


1HPV 


086 


-718.5795 


2BBQ 


087 


-705.3978 


1MS6 


088 


-695.0241 


1IF7 


089 


-689.7998 


1JIL 


090 : 


-684.7289 


1A8J 


091 


-676.3861 


1FL3 


092' 


-628.8081 


1CIZ 


093 


-619.2121 


1DIF 


094 : 


-604.7057 


2BPX 


095 


-598.4143 


1IF9 


096 : 


-564.5807 


1K0C 


097] 


-561.6472 


1KN2 


098 : 


-541.1021 


1HBV 


099] 


-507.6808 


1DB4 


100 


-496.0550 


1K1J 



BOLD: LIGAND CONTAINED IN 1AIX 
ITALIC: SERINE PROTEASE 
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FIG.68 




RANKING 19 
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FIG.70 




RANKING 35 
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FIG.72 




RANKING 38 



FIG.73 




LIGAND CONTAINED IN 1FOR 
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FIG.74 




H 2 N NH 2 
LIGAND CONTAINED IN 1KIM 



84 



EP 1 724 697 A1 



FIG.76 




FIG.77 
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FIG.78 
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FIG.79 



ATOM OF 
ACTIVE SITE 


ATOM TYPE OF 
LIGAND 


INTENSITY OF 
INTERACTION 


DISTANCE OF 
INTERACTION [A] 


CYS145 N 


Oco2 


100 


2.70 


MET165 CG 


C.3 


100 


4.00 


GLU166 N 


0.2 


100 


2.70 


THR190 N 


0.3 


100 


2.70 
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FIG.80 



I RANKING 


ENERGY 


PDB code 


REMARKS ! 


1 


A Ann nj o 

-1089.2153 


1QF4 


ligase 


2 


-990.9917 


I 1 KZL 


transferase 


3 


-906.5003 


1C0A 


ligase/RNA 


4 


-889.1661 


1KGQ 


transferase 


5 


-869.3531 


1195 


ribosome 


6 


-860.2331 


1JR4 


transferase 


7 


-858.0005 


1A2N 


transferase 


8 


-832.0515 


1NKK 


hydrolase 


9 


-TOO <*) r J r 

-788.3545 


1JIL 


ligase 


10 


-757.2852 


1 EJB 


transferase 


1 1 


-697.9477 


1DMT 


hydrolase 




-b4o. 0^:69 


1HAU 


complex (protease/inhibitor) 


13 


coo h ncn 

-oJo.l^bU 


1F74 


lyase 


I** 


AOfl Gfi7R 

-D^o.yo / O 


1 KYU 


endocytosis/exocytosis 


I O 


-til 


a Kino 

1 NKb 


serine proteinase/receptor 


I D 


-ouo. z + toy 


Q I V*7 

yLYZ. 


hydrolase (o-glycosyl) 


17 


fifiA 977R 


1 tiu 


lipid-binding protein 


I o 




lr / D 


lyase 


19 


-585.7663 


1LMW 


complex (serine protease/inhibitor) 


20 


-584.0059 


1R1R 


oxidoreductase 


21 


-580.1563 


1IL2 


ligase/RNA 


22 


-573.0481 


1BLL 


hydrolase(alpha-aminoacylpeptide) 


23 


-572.6763 


1E1F 


glycoside hydrolase 


24 


-540.1965 


1LKL 


complex (tyrosine kinase/peptide) 


25 


-524.2817 


1UK4 


hydrolase 


26 


-518.3528 


1LCB 


transferase (methyltransferase) 


27 


-506.8123 


1PGN | 


oxidoreductase (choh(d)-nadp+(a)) 


28 


-493.5477 


1I5Q 


hydrolase 


29 


-486.8954 


1KYD 


endocytosis/exocytosis 


| 30 


-481 .9659 


1NRR [serine proteinase/receptor 
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FIG.81 




FIG.82 
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FIG.83 



O 




FIG.84 



ATOM OF 
ACTIVE SITE 


ATOM TYPE OF 
LIGAND 


INTENSITY OF 
INTERACTION 


DISTANCE OF 
INTERACTION [A] 


CYS145 N 


0.co2 


100 


2.70 


j GLU0D166 N 


0.2 


100 


2.70 


THR190 N 


0.3 


100 


2.70 
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FIG.85 
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FIG.86 



RANKING 


ENERGY 


PDB code 


REMARKS j 


1 


-1263.8870 


1EAD 


dihydrolipoamide acetyltransferase 


2 


-1260.8689 


1F6M 


oxidoreductase 


3 


-1147.1739 


1JR4 


transferase 


4 


-1141.9917 


1QF5 


ligase 


5 


-1104.9447 


| 1JAY 


structural genomics 


6 


-1019.3584 


1KZL 


transferase 


7 


-996.5865 


! 1QF4 


ligase 


8 


-988.6588 


1JIJ 


ligase 


9 


-981.8594 


8ICO 


complex (nucleotidyltransferase/dna) 


10 


-953.0986 


1L09 


hydrolase 


11 


-949.1903 


1JTU 


transferase 


12 


-922.4795 


1JKX 


transferase 


13 


-918.4892 


1JIL 


ligase 


14 


-916.9950 


1195 


ribosome 


15 


-908.4880 


1AL6 


lyase 


16 


-893.5862 


1LKL 


complex (tyrosine kinase/peptide) 


17 


-892.3713 


1N37 


deoxyribonucleic acid 


18 


-887.9721 


1LCB 


transferase (methyltransferase) 


19 


-866.9600 


109F 


protein binding 


20 


-826.4893 


1L07 


hydrolase 


21 


-792.0254 


4UAG 


ligase 


22 


-776.9998 


1EJB 


transferase 


23 


-772.2400 


1BFZ 


n-terminal product peptide 


24 


-769.6844 


1F9E 


apoptosis 


25 


-762.5275 


1TLP 


hydrolase (metalloproteinase) 


26 


-759.8312 


1QIN 


lyase 


27 


-758.2140 


1K06 


transferase 


28 


-757.5526 


1C0A 


ligase/RNA 


29 


-755.7987 


1QD1 


transferase 


30 


-755.1049 


1L08 


hydrolase j 










49 


-639.1858 


1UK4 


hydrolase 
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FIG.87 



ATOM OF 
ACTIVE SITE 


ATOM TYPE OF 
LIGAND 


INTENSITY OF 
INTERACTION 


DISTANCE OF 
INTERACTION [A] 


THR25 OG1 


N.am 


100 


3.80 


i CYS145 N 


Oco2 


100 


2.70 


MET165 CG 


C.3 


100 


4.00 


GLU166 N 


0.2 


100 


2.70 


THR190 N 


0.3 


100 


2.70 



FIG.88 



RANKING 


ENERGY 


PDB code 


REMARKS 


1 


-364.6548 


1195 


ribosome 


2 


-299.0166 


1UK4 


hydrolase 


3 


-109.6867 


1BXX 


endocytosis/exocytosis 


4 


-93.0540 


1KZL 


transferase 


5 


-72.9399 


1NKK 


hydrolase 


6 


-10.7565 


1F8H 


endocytosis/exocytosis 


7 


-4.2756 


1QTN 


apoptosis 


8 


162.1557 


1KGQ 


transferase 


9 


163.2075 


109F 


protein binding 


10 


331 .8725 


1CGL 


metalloprotease 


11 


370.5027 


2BBQ 


transferase(methyltransferase) 


1 ? 


^Q7 Rzlfift 
JO/ .o*too 


4UIVIK 


oxiuoreauctase 


13 


550.2598 


1HPG 


hydrolase (serine protease) 


14 


716.6561 


i 1LOC 


lectin 


15 


839.7398 


1DMT 


hydrolase 


16 


848.7090 


1KAP 


zinc metalloprotease 


17 


850.2630 


1JG3 


transferase 


18 


883.4400 


1BC5 


complex (methyltransferase/peptide) 


19 


905.9695 


1FCH 


signaling protein 


20 


913.9769 


1CF8 


catalytic antibody 


21 


1088.2428 


1NWE 


hydrolase 


22 


1089.3496 


1K06 


transferase 


23 


1.116.9042 


1F74 


lyase 


24 


1131.4783 


1ING 


hydrolase (o-glycosyl) 


25 


1132.3648 


1131 


endocytosis/exocytosis 


26 


1148.9063 


1IAU 


hydrolase 


27 


1156.0335 


1B48 


transferase 


28 


1160.3512 


1PTT 


complex (hydrolase/peptide) 


29 


1176.7814 


1MC5 


oxidoreductase 


30 


1197.3565 


1F9E 


apoptosis 
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FIG.89 




FIG.90 



ATOM OF 
ACTIVE SITE 


ATOM TYPE OF 
LIGAND 


INTENSITY OF 
INTERACTION 


DISTANCE OF 
INTERACTION [A] 


CYS145 N 


ACCEPTOR 


100 


2.70 


MET165 CG 


CARBON 


100 


4.00 


GLU166 N 


ACCEPTOR 


100 


2.70 


i THR190 N 


ACCEPTOR 


100 


2.70 
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FIG.91 



! RANKING 


ENERGY 


PDB code 


REMARKS 


i 1 


-2095.8588 


1JJQ 


hormone/growth factor 


2 


-2011.3626 


2BVW 


hydrolase 


! 3 


i -1670.8384 


1DOG 


hydrolase 


4 


-1336.7960 


1LWJ 


transferase 


5 


-1320.0704 


1KEU 


lyase 


6 


-1230.0604 


1GAH 


hydrolase 


7 


-1214.9459 


1I7E 


signaling protein 


8 


-1195.8653 


1C39 


signaling protein 


9 


-1191.3777 


1BB5 


hydrolase 


10 


-1189.0253 


2FHI 


nucleotide-binding protein 


11 


-1147.9761 


L 1GG6 


glycopeptide antibiotics 


12 


-1103.6272 


| 1M4D 


transferase 


13 


-1095.3050 


; 1QHC 


hydrolase 


14 


| -1088.7299 


| 1M2N 


gene regulation 


15 


-1078.3684 


I 1QGL 


lectin (agglutinin) 


16 


-1056.4078 


4ENG 


glycosyl hydrolase 


17 


-1033.0227 


1LON 


ligase 


18 


-1031.2555 


1MWL 


ribonucleic acid 


19 


-1027.4239 


1QPK 


hydrolase 


20 


-1014.9817 


1UDB 


isomerase 


21 


-1005.1689 


1GQC 


transferase 


22 


-976.9293 


1H6H 


px domain 


23 


-975.2827 


1LSP 


hydrolase (o-glycosyl) 


24 


-973.5218 


1FF1 


signaling protein 


25 


-963.4098 


3UAG 


ligase 


26 


-937.2165 


1IBG 


immunoglobulin 


27 


-933.6818 


1 DRV 


oxidoreductase 


28 


-918.6947 


2MBR 


oxidoreductase 


29 


-917.1703 


1 NAB 


deoxyribonucleic acid 


30 


-897.3026 


1SLY 


glycosyltransferase 










774 


331.9928 


1UK4 


hydrolase 
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FIG.92 




FIG.93 



ATOM OF 
ACTIVE SITE 


ATOM TYPE OF 
LIGAND 


INTENSITY OF 
INTERACTION 


DISTANCE OF 
INTERACTION [A] 


CYS145 N 


0.co2 


100 


2.70 


MET165 CG 


C.3 


100 


4.00 


GLU166 N 


0.2 


100 


2.70 


THR190 N 


03 


100 


2.70 
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FIG.94 



| RANKING 


I ENERGY 


PDB code 


REMARKS 


1 


-1047.3743 


1KZL 


transferase 


2 


-860.437 


1J71 


hydrolase 


3 


-844.8737 


3UAG 


ligase 


4 


-837.6255 


1LKL 


complex (tyrosine kinase/peptide) 


5 


-829.8176 


1QF4 


ligase 


6 


-732.2087 


1A2N 


transferase 


7 


-721.6213 


1G1F 


hydrolase, signaling protein 


8 


-698.5922 


1F7B 


lyase 


9 


-689.1472 


1BFZ 


n-terminal product peptide I 


10 


-646.7943 


148L 


hydrolase(o-glycosyl) 


11 


-634.4654 


1CGL 


metalloprotease 


12 


-629.1673 


1JIL 


ligase 


13 


-616.8733 


1FF1 


signaling protein 


14 


-611.1171 


1F9E 


apoptosis 


15 


-567.0738 


1R1R 


oxidoreductase 


16 


-554.5321 


1195 


ribosome 


17 


-547.2494 


1FQX 


hydrolase 


18 


-536.7069 


1HCT 


complex (signal transduction/peptide) 


19 


-531.1014 


1SIA 


mucin motif 


20 


-508.9899 


1JIJ 


ligase 


21 


-507.9655 


1LSP 


hydrolase (o-glycosyl) 


22 


-497.6341 


1F8H 


endocytosis/exocytosis 


23 


-492.3974 


1F74 


lyase 


24 


-443.232 


1QH5 


hydrolase 


25 


-427.5925 


1JII 


ligase 


26 


-417.4991 


1JQY 


toxin 


27 


-416.9956 


2KCE 


methyltransferase 


28 


-396.7898 


1EJB 


transferase 


29 


-387.6441 


1MMJ 


hydrolase 


30 


-358.2162 


1SLY 


glycosyltransferase 










39 


-245.9500 


1UK4 


hydrolase 
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FIG. 95 




FIG.96 




MINIMUM 
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FIG.97 
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FIG. 107 
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FIG.109 




FIG.110 
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FIG.111 




FIG. 112 



ATOM OF 
ACTIVE SITE 


ATOM TYPE OF 
LIGAND 


INTENSITY OF 
INTERACTION 


DISTANCE OF 
INTERACTION [A] 


LUE4 O 


N.pl3 


100 


2.87 


ASP26 OD1 


N.ar 


300 


3.00 


ASP26 OD2 


N.pl3 


300 


3.00 
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FIG.113 




FIG. 114 




NH 2 
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FIG.115 

( START ^ 



PREPARE LIGAND DATABASE 



SO 



SELECT AND ACQUIRE TARGET PROTEIN 3D - 810 
STRUCTURE 



CALCULATE NORMAL VIBRATION OF 
TARGET PROTEIN 



MOLECULAR DYNAMIC CALCULATION OF 
TARGET PROTEIN (WITH CONSTRAINT) 



SELECT TYPICAL SPATIAL POINT ON 
REFERENCE TARGET PROTEIN 



SUPERPOSE ONE OF LIGAND COORDINATES 
IN STEP 0 TO POINT GROUPS IN STEP 40 



OPTIMIZE INTERACTION FUNCTION WITH 
TARGET PROTEIN WHILE FINELY ADJUSTING 
CONFORMATION OF LIGAND 



I 



LARGELY MODIFY CONFORMATION OF 
LIGAND 



S20 
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S40 
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S60 



S70 



DETERMINE INTERACTION ENERGY 
BETWEEN TARGET PROTEIN AND LIGAND 



I 



RETURN TO STEP 40, SELECT OTHER 
LIGAND IN STEP 0. AND REPEAT 
CALCULATION UP TO S80 



S80 



S90 



SELECT LIGAND 



S100 
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FIG.117 



,.102c 

INTERACTION FUNCTION 
CALCULATING UNIT 

INTERACTION FUNCTION I ! 
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